From aragon at phat.za.net Sat Nov 1 00:39:41 2008
From: aragon at phat.za.net (Aragon Gouveia)
Date: Sat Nov 1 00:39:48 2008
Subject: 6.4 RC1 locks up solid on first reboot
In-Reply-To: <20081101064232.GA36710@phat.za.net>
References: <200810230627.46478.freebsd-stable@dino.sk> <209111CB-326D-4758-80B2-2505CAE9BCDF@netconsonance.com> <20081025003750.GA42077@phat.za.net> <20081025012148.GA48297@icarus.home.lan> <20081025014218.GA47549@phat.za.net> <20081025080945.GA55413@icarus.home.lan> <20081101064232.GA36710@phat.za.net>
Message-ID: <20081101073939.GA41841@phat.za.net>

| By Aragon Gouveia
| [ 2008-11-01 08:43 +0200 ]
> > Anything is possible. Can you please rebuild your system, and the
> > bootstraps (and don't forget to install them; bsdlabel -B ),
> > without messing with optimisation flags?
>
> I just upgraded to 7.1 today. No luck, even with loader compiled without
> optimisations, it is still unstable on my system for some reason. No
> biggie for me, but possibly worthy of investigation if others have similar
> problems. (my hardware is quite new still)
>
> I'm more than open to help troubleshoot this further if you (or anyone) has
> any ideas on what to try next.

Quick (progressive) update!

I thought I'd see how loader behaves on my various boot CDs. My
7.0-RELEASE discs both work fine - loader doesn't freeze. My 7.1-BETA2
disc's loader behaves the same as the loader I compile.

I've copied the loader binary from one of the 7.0-RELEASE discs onto my
system and it works great just like that, and no more freezing.

Perhaps a change that happened in RELENG_7 after 7.0-RELEASE is causing
this? I've been experiencing it since my RELENG_7 checkout in May this
year.


Regards,
Aragon

From torfinn.ingolfsen at broadpark.no Sat Nov 1 12:06:08 2008
From: torfinn.ingolfsen at broadpark.no (Torfinn Ingolfsen)
Date: Sat Nov 1 12:06:15 2008
Subject: FreeBSD 6.5-prerelease and if_re - patches needed?
In-Reply-To: <20081003083443.GD71518@cdnetworks.co.kr>
References: <20080921215704.eca7300b.torfinn.ingolfsen@broadpark.no> <20080922021022.GC26294@cdnetworks.co.kr> <20081002222542.849d5481.torfinn.ingolfsen@broadpark.no> <20081003083443.GD71518@cdnetworks.co.kr>
Message-ID: <20081101200606.e50b5dbc.torfinn.ingolfsen@broadpark.no>

On Fri, 03 Oct 2008 17:34:43 +0900
Pyun YongHyeon wrote:

> Maybe you can use if_re.c/if_rlreg.h in RELENG_7 with minor
> modification.

I have tried that now. Unfortunately, the differences were too many -
I never managed to get it to compile.

Has anyone else created patches for if_re for RELENG_6?
--
Regards,
Torfinn Ingolfsen

From pyunyh at gmail.com Sat Nov 1 22:11:20 2008
From: pyunyh at gmail.com (Pyun YongHyeon)
Date: Sat Nov 1 22:11:26 2008
Subject: FreeBSD 6.5-prerelease and if_re - patches needed?
In-Reply-To: <20081101200606.e50b5dbc.torfinn.ingolfsen@broadpark.no>
References: <20080921215704.eca7300b.torfinn.ingolfsen@broadpark.no> <20080922021022.GC26294@cdnetworks.co.kr> <20081002222542.849d5481.torfinn.ingolfsen@broadpark.no> <20081003083443.GD71518@cdnetworks.co.kr> <20081101200606.e50b5dbc.torfinn.ingolfsen@broadpark.no>
Message-ID: <20081102050915.GA90993@cdnetworks.co.kr>

On Sat, Nov 01, 2008 at 08:06:06PM +0100, Torfinn Ingolfsen wrote:
> On Fri, 03 Oct 2008 17:34:43 +0900
> Pyun YongHyeon wrote:
>
> > Maybe you can use if_re.c/if_rlreg.h in RELENG_7 with minor
> > modification.
>
> I have tried that now. Unfortunately, the differences were too many -
> I never managed to get it to compile.
>
> Has anyone else created patches for if_re for RELENG_6?

http://people.freebsd.org/~yongari/re/6.x/README

Hope this helps.
--
Regards,
Pyun YongHyeon

From peterjeremy at optushome.com.au Sun Nov 2 01:37:09 2008
From: peterjeremy at optushome.com.au (Peter Jeremy)
Date: Sun Nov 2 01:37:16 2008
Subject: System hanging during dump
In-Reply-To: <20081019083902.GP7782@deviant.kiev.zoral.com.ua>
References: <20081015082428.GE26536@server.vk2pj.dyndns.org> <20081015083538.GA72190@icarus.home.lan> <48F65490.6040305@FreeBSD.org> <20081019032104.GB25796@server.vk2pj.dyndns.org> <20081019083902.GP7782@deviant.kiev.zoral.com.ua>
Message-ID: <20081102083704.GH99398@server.vk2pj.dyndns.org>

Sorry for the late reply.

On 2008-Oct-19 11:39:02 +0300, Kostik Belousov wrote:
>> I have built myself a looping 'ps -axl' which should let me gather more
>> information if it does re-appear. (In the process, I've found that ps
>> leaks memory, though that's not a problem until you wrap it in a loop).
>
>What memory? Kernel memory? How did you notice this? Could you add
>vmstat -z and vmstat -m to the loop and watch which allocation grows?

ps(1) malloc's memory and doesn't free it. This isn't an issue in
normal operation because it's a once-through program. I hacked ps to
turn the guts of main() into a while(1){} loop and this showed the
process was growing. There were a couple of superfluous strdup()
calls that could be removed but I don't think it's worth making it
exhaustively clean up after itself (my hacking included hard-wiring
the options so I'm not sure my cleanup code is complete in the general
case). As a low priority, I'll create a PR covering the strdup's.

--
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.

From rwatson at FreeBSD.org Sun Nov 2 01:50:31 2008
From: rwatson at FreeBSD.org (Robert Watson)
Date: Sun Nov 2 01:50:37 2008
Subject: Install issues with 7.x
In-Reply-To:
References:
Message-ID:

On Wed, 29 Oct 2008, Ryan wrote:

> Hello, I purchased a new Clevo M860TU on the account that it ran Linux very
> well and was hoping it would fare the same on FreeBSD. Not so much, little
> help? I posted this in mobile originally but thought stable would be a better
> choice. Don't know if it is more appropriate here or ACPI.
>
> I'm giving you as much information as I know how to get. As I cannot get
> sysinstall to load I am having to type all these dmesg. The boot process is
> hanging. This is all with 7.x, I can give 6.x if needed.

xpt_config is the CAM configuration wait, so basically the system is
waiting for a storage device to report back on whether it could be used
as a root file system. I recently saw a similar report of problems
involving a firewire controller on an nvidia motherboard following an
upgrade to 7.x, and I wonder if you might try the following: see if 6.4
will install, and if so, install it. Then cvsup 7.x, and do a
buildworld but not an installworld. This will let you build and
experiment with 7.x kernels from a known-working environment. Make sure
to keep a working 6.x kernel around -- I suggest something like
"cp -r /boot/kernel /boot/kernel.good" before starting so you can always
fall back to a good kernel. Now try building a 7.x kernel without USB
or firewire support, and booting that?

Also, it's worth checking there are no BIOS upgrades available for the
motherboard...
Robert N M Watson
Computer Laboratory
University of Cambridge

>
> Hardware:
> Intel P9500
> 4gb DDR3-1066
> Nvidia 9800M GT
> Atheros AR5006e
>
> FreeBSD 7.1-BETA2
>
> These snippets of dmesg happen around the end where it hangs.
>
> 1. Default
>
> ...
> cpu0: on acpi0
> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj
> 0xc6a02d40 [20070320]
> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving
> operands for [OpcodeName unavailable] [20070320]
> ACPI Error (psparse-0626): Method parse/execution failed
> [\_PR_.CPU0._OSC] (Node 0xc68556e0), AE_AML_INTERNAL
> est0: on cpu0
> p4tcc0: on cpu0
> cpu1: on acpi0
> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj
> 0xc6a0e300 [20070320]
> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving
> operands for [OpcodeName unavailable] [20070320]
> ACPI Error (psparse-0626): Method parse/execution failed
> [\_PR_.CPU1._OSC] (Node 0xc685560), AE_AML_INTERNAL
> est1: on cpu1
> p4tcc1: on cpu1
> ...
> cpu0: Cx states changed
> cpu1: Cx states changed
> unknown: timeout waiting for read DRQ
> unknown: timeout waiting for read DRQ
> acd0: DVDR at ata3-master UDMA33
> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install
> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
>
> Then just stalls
>
> 2. No ACPI
>
> ...
> unknown: timeout waiting for read DRQ
> unknown: timeout waiting for read DRQ
> acd0: DVDR at ata3-master UDMA33
> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install
> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
>
> Then just stalls
>
> 3. Safe Mode
>
> I can only tell you a little because console is spammed. It is the
> same as no ACPI, but with an interrupt storm.
>
> ...
> unknown: timeout waiting for read DRQ
> unknown: timeout waiting for read DRQ
> acd0: DVDR at ata3-master UDMA33
> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install
> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config
> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
>
> When it gets to the unknowns, this is spammed.
>
> interrupt storm detected on "irq10:"; throttling interrupt source
>
> Other than the interrupt storm spam, it is halted like the others.
>
>
> 4. Single User Mode
>
> Same as 1, Default
>
>
> 5. Verbose
>
> All I can tell you is what is spammed at the end.
>
> acpi: bad write to port 0x080 (32), val hex
>
> Where hex is ever increasing and loops when it hits 0xff01. I can also
> see run_interrupt_driven_hooks message in all the spam.
>
> Using some googling if you add the sysctl before boot
>
> debug.acpi.block_bad_io=1
>
> it might be of some help. This just leads to a never ending loop of
> acpi errors - they scroll very fast and are difficult to record might I
> add!
>
> ...
> acpi: bad write to port 0x080 (32), val hex
> ACPI Exception (evregion-0529): AE_BAD_PARAMETER, Returned by handler
> for [SystemIO] [20070320]
> ACPI Error (psparse-0626): Method parse/execution failed [\P8XH] (Node
> 0xc6850a60), AE_BAD_PARAMETER
> ACPI Error (psparse-0626): Method parse/execution failed [\_GPE._L01]
> [20070320]
> ACPI Exception (evgpe-0687): AE_BAD_PARAMETER, while evaluating GPE
> method [_L01] [20070320]
> --repeat--
> ...
>
>
> FreeBSD 7.0-REL
>
> 7.0 is a little different than 7.1. Messages are somewhat the same but
> they happen near the beginning of dmesg instead of around the end. The
> run_interrupt_driven_hooks issue is nonexistent as well, but it still
> hangs. I'm guessing that's a debug tool more than an error.
>
> 1. Default
>
> ...
> cpu0: on acpi0
> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj
> 0xc6862580 [20070320]
> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving
> operands for [OpcodeName unavailable] [20070320]
> ACPI Error (psparse-0626): Method parse/execution failed
> [\_PR_.CPU0._OSC] (Node 0xc682d580), AE_AML_INTERNAL
> est0: on cpu0
> p4tcc0: on cpu0
> cpu1: on acpi0
> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj
> 0xc6861100 [20070320]
> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving
> operands for [OpcodeName unavailable] [20070320]
> ACPI Error (psparse-0626): Method parse/execution failed
> [\_PR_.CPU1._OSC] (Node 0xc682d4a0), AE_AML_INTERNAL
> est1: on cpu1
> p4tcc1: on cpu1
> ...
> cpu0: Cx states changed
> cpu1: Cx states changed
> unknown: timeout waiting for read DRQ
> unknown: timeout waiting for read DRQ
> acd0: DVDR at ata3-master UDMA33
> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install
>
> Hangs.
>
> 2. No ACPI
>
> ..
> unknown: timeout waiting for read DRQ
> unknown: timeout waiting for read DRQ
> ..
>
> Hangs.
>
>
> 3. Safe Mode
>
> Same interrupt storm as 7.1-BETA2.
>
> ...
> interrupt storm detected on "irq10:"; throttling interrupt source
> --repeat--
>
> 4. Single User Mode
>
> Same as 1. Default.
>
>
> 5. Verbose
>
> Hang like normal, cannot see the ACPI errors since they fly off the
> scroll lock buffer.
>
> ...
> cpu0: Cx states changed
> cpu1: Cx states changed
> ...
> unknown: timeout waiting for read DRQ
> unknown: timeout waiting for read DRQ
> ...
>
>
> Thanks again.
>

From kostikbel at gmail.com Sun Nov 2 02:54:25 2008
From: kostikbel at gmail.com (Kostik Belousov)
Date: Sun Nov 2 02:54:32 2008
Subject: System hanging during dump
In-Reply-To: <20081102083704.GH99398@server.vk2pj.dyndns.org>
References: <20081015082428.GE26536@server.vk2pj.dyndns.org> <20081015083538.GA72190@icarus.home.lan> <48F65490.6040305@FreeBSD.org> <20081019032104.GB25796@server.vk2pj.dyndns.org> <20081019083902.GP7782@deviant.kiev.zoral.com.ua> <20081102083704.GH99398@server.vk2pj.dyndns.org>
Message-ID: <20081102105417.GZ18100@deviant.kiev.zoral.com.ua>

On Sun, Nov 02, 2008 at 07:37:05PM +1100, Peter Jeremy wrote:
> Sorry for the late reply.
>
> On 2008-Oct-19 11:39:02 +0300, Kostik Belousov wrote:
> >> I have built myself a looping 'ps -axl' which should let me gather more
> >> information if it does re-appear. (In the process, I've found that ps
> >> leaks memory, though that's not a problem until you wrap it in a loop).
> >
> >What memory? Kernel memory? How did you notice this? Could you add
> >vmstat -z and vmstat -m to the loop and watch which allocation grows?
>
> ps(1) malloc's memory and doesn't free it. This isn't an issue in
> normal operation because it's a once-through program. I hacked ps to
> turn the guts of main() into a while(1){} loop and this showed the
> process was growing. There were a couple of superfluous strdup()
> calls that could be removed but I don't think it's worth making it
> exhaustively clean up after itself (my hacking included hard-wiring
> the options so I'm not sure my cleanup code is complete in the general
> case). As a low priority, I'll create a PR covering the strdup's.

Thank you for the clarification. Please Cc: me on the PR, I will look at it.

From torfinn.ingolfsen at broadpark.no Sun Nov 2 03:29:08 2008
From: torfinn.ingolfsen at broadpark.no (Torfinn Ingolfsen)
Date: Sun Nov 2 03:29:15 2008
Subject: FreeBSD 6.5-prerelease and if_re - patches needed?
In-Reply-To: <20081102050915.GA90993@cdnetworks.co.kr>
References: <20080921215704.eca7300b.torfinn.ingolfsen@broadpark.no> <20080922021022.GC26294@cdnetworks.co.kr> <20081002222542.849d5481.torfinn.ingolfsen@broadpark.no> <20081003083443.GD71518@cdnetworks.co.kr> <20081101200606.e50b5dbc.torfinn.ingolfsen@broadpark.no> <20081102050915.GA90993@cdnetworks.co.kr>
Message-ID: <20081102122906.eeceebd8.torfinn.ingolfsen@broadpark.no>

On Sun, 02 Nov 2008 14:09:15 +0900
Pyun YongHyeon wrote:

> http://people.freebsd.org/~yongari/re/6.x/README
>
> Hope this helps.

Yes it does, thanks!

On boot, there is a noticeable delay (tens of seconds) after printing
these lines:
re0: port 0xee00-0xeeff mem 0xfdfff000-0xfdffffff irq 19 at device 0.0 on pci2
re0: turning off MSI enable bit.
re0: Chip rev. 0x38000000
re0: MAC rev. 0x00000000

The original didn't have that delay.
Other than that it works much better.
Details:
root@kg-vm# uname -a
FreeBSD kg-vm.kg4.no 6.4-PRERELEASE FreeBSD 6.4-PRERELEASE #3: Sun Nov 2 10:44:32 CET 2008 root@kg-vm.kg4.no:/usr/obj/usr/src/sys/SMP amd64
root@kg-vm# pciconf -lv | grep re0 -A 4
re0@pci2:0:0: class=0x020000 card=0x81aa1043 chip=0x816810ec rev=0x01 hdr=0x00
    vendor = 'Realtek Semiconductor'
    device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
    class = network
    subclass = ethernet

Note: I haven't tested if_rl, so I don't know how this patch affects that.
I'll get back with a note on stability sometime next week (after I have
done lots of data transfers to and from this box).
--
Regards,
Torfinn Ingolfsen

From jbolivar at cantv.net Sun Nov 2 09:36:59 2008
From: jbolivar at cantv.net (Julian Bolivar)
Date: Sun Nov 2 09:37:06 2008
Subject: /stand/sysinstall freezed on FreeBSD 7.1 install
Message-ID: <490DDABA.9000304@cantv.net>

Dear Friends,

I am trying to install FreeBSD 7.1 AMD64 beta 2 on an Intel Core 2 Duo
with an MSI 975X Platinum V.2m motherboard, but when /stand/sysinstall
tries to start from the installation CD, the system freezes and the
install process does not continue.

Does anyone know how to solve this problem so that I can install it?

Thanks and regards,

Julian Bolivar

From pyunyh at gmail.com Sun Nov 2 16:49:20 2008
From: pyunyh at gmail.com (Pyun YongHyeon)
Date: Sun Nov 2 16:49:27 2008
Subject: FreeBSD 6.5-prerelease and if_re - patches needed?
In-Reply-To: <20081102122906.eeceebd8.torfinn.ingolfsen@broadpark.no>
References: <20080921215704.eca7300b.torfinn.ingolfsen@broadpark.no> <20080922021022.GC26294@cdnetworks.co.kr> <20081002222542.849d5481.torfinn.ingolfsen@broadpark.no> <20081003083443.GD71518@cdnetworks.co.kr> <20081101200606.e50b5dbc.torfinn.ingolfsen@broadpark.no> <20081102050915.GA90993@cdnetworks.co.kr> <20081102122906.eeceebd8.torfinn.ingolfsen@broadpark.no>
Message-ID: <20081103004714.GB94302@cdnetworks.co.kr>

On Sun, Nov 02, 2008 at 12:29:06PM +0100, Torfinn Ingolfsen wrote:
> On Sun, 02 Nov 2008 14:09:15 +0900
> Pyun YongHyeon wrote:
>
> > http://people.freebsd.org/~yongari/re/6.x/README
> >
> > Hope this helps.
>
> Yes it does, thanks!
>
> On boot, there is a noticeable delay (tens of seconds) after printing
> these lines:
> re0: port 0xee00-0xeeff mem 0xfdfff000-0xfdffffff irq 19 at device 0.0 on pci2
> re0: turning off MSI enable bit.
> re0: Chip rev. 0x38000000
> re0: MAC rev. 0x00000000
>
> The original didn't have that delay.

I've changed re(4) to wait for the completion of DMAable memory
allocation during the bus_dma cleanups. The delay you've seen may be
related to that change. Previously it just failed to load the driver
if there was no available memory at the time of driver loading.
However, I guess that delay wouldn't happen if the driver is statically
linked into the kernel. Did you use a kernel module?
In theory PCIe variants of RealTek controllers would work with DAC, so
I could alleviate the memory allocation restrictions imposed by bus_dma
by allowing 64-bit DMA addressing. Since I don't have PCIe based
RealTek controllers and no datasheets are available for PCIe based
controllers, it's somewhat difficult to change the current allocation
restrictions.

> Other than that it works much better. Details:
> root@kg-vm# uname -a
> FreeBSD kg-vm.kg4.no 6.4-PRERELEASE FreeBSD 6.4-PRERELEASE #3: Sun Nov 2 10:44:32 CET 2008 root@kg-vm.kg4.no:/usr/obj/usr/src/sys/SMP amd64
> root@kg-vm# pciconf -lv | grep re0 -A 4
> re0@pci2:0:0: class=0x020000 card=0x81aa1043 chip=0x816810ec rev=0x01 hdr=0x00
>     vendor = 'Realtek Semiconductor'
>     device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
>     class = network
>     subclass = ethernet
>
> Note: I haven't tested if_rl, so I don't know how this patch affects that.

rl(4) has a single change to build with the updated if_rlreg.h and I
don't think that would affect the stability of rl(4).

> I'll get back with a note on stability sometime next week (after I have done lots of data transfers to and from this box).

Ok.

--
Regards,
Pyun YongHyeon

From pyunyh at gmail.com Sun Nov 2 20:11:04 2008
From: pyunyh at gmail.com (Pyun YongHyeon)
Date: Sun Nov 2 20:11:12 2008
Subject: can not wake on lan after halt -p (or shutdown -p now) on releng_7 and releng_7_0
In-Reply-To: <20081014064456.GE14769@cdnetworks.co.kr>
References: <596673353.20081006181334@pulsar.bg> <20081010012058.GA99376@cdnetworks.co.kr> <1948191744.20081010114326@pulsar.bg> <20081014064456.GE14769@cdnetworks.co.kr>
Message-ID: <20081103040859.GD94302@cdnetworks.co.kr>

On Tue, Oct 14, 2008 at 03:44:56PM +0900, To Georgi Iovchev wrote:
> On Fri, Oct 10, 2008 at 11:43:26AM +0300, Georgi Iovchev wrote:
> >
> >
> > --
> > Friday, October 10, 2008, 4:20:58 AM:
> >
> > > On Mon, Oct 06, 2008 at 06:13:34PM +0300, Georgi Iovchev wrote:
> > >> Hello list
> > >>
> > >> I have a shutdown problem. I have a machine with gigabyte GA-G33M-DS2R
> > >> motherboard. Integrated network card is Realtek 8111B.
> > >> I can not wake the computer after I shutdown it from FreeBSD.
> > >> It is a dualboot system - windows xp and freebsd. If I shutdown the
> > >> computer from windows - later I can wake it up with magic packet. Even
> > >> if I shutdown the machine on the boot menu with the power button - then
> > >> later I can wake on lan. The only situation where I CANNOT wake it is
> > >> when I shutdown the machine from freebsd (halt -p).
> > >>
> > >> First I tested with 7.0-RELEASE-p5 amd64 (RELENG_7_0) and then I
> > >> upgraded to 7.1 PRERELEASE amd64 (RELENG_7). I also tested with two
> > >> network cards - the integrated one Realtek 8111B and another one Intel
> > >> PRO1000PT PCI-E with WOL enabled.
> > >>
> >
> > > Don't know WOL issue of em(4) but re(4) should respond to WOL.
> > > 7.0-RELEASE had no support for WOL so RELENG_7 or 7.1-PRERELEASE
> > > should be used to experiment WOL.
> > Now I am using 7.1-prerelease
> >
> > >> With both nics and both freebsd versions the situation is the same -
> > >> after shutdown from bsd the computer is not able to wake on lan. The
> >
> > > Because you can wake up your system from Windows shutdown I think
> > > your BIOS is already configured to allow wakeup from WOL. Would
> > > you compare ethernet address of re(4) to Windows? Have you tried to
> > > send Magic packets to FreeBSD box?
> > I have tried sending magic packets from another bsd machine. I am
> > using net/wol. I also tried to send magic packets from windows machine
> > using 3 different programs.
> >
> > > You may also try suspend your box with acpiconf and resume from WOL.
> > I can't.
> >
> > [root@backup ~]# acpiconf -s 5
> > acpiconf: invalid sleep type (5)
> >
> > Actually I can't enter any sleep state
> > [root@backup ~]# acpiconf -s 4
> > acpiconf: request sleep type (4) failed: Operation not supported
> > [root@backup ~]# acpiconf -s 3
> > acpiconf: request sleep type (3) failed: Operation not supported
> > [root@backup ~]# acpiconf -s 2
> > acpiconf: request sleep type (2) failed: Operation not supported
> > [root@backup ~]# acpiconf -s 1
> > acpiconf: request sleep type (1) failed: Operation not supported
> >
> > I am using generic kernel with little modifications, (generally I have
> > commented many unused drivers - raid, if_....) Acpi is in generic
> > kernel now.
> >
> > I even tried to wake the machine with magic packet after shutdown -h.
> > But still no luck.
> >
> >
> > >> indication on the switch port says that after shut down there is
> > >> active link.
> > >>
> >
> > > That indicates the controller is alive so it shall respond to WOL
> > > if it was correctly configured to receive WOL packets. Have you
> > > tried to send Magic packets to FreeBSD box?
> >
> > >> Here is some information after last update:
> > >>
> > >> [root@backup ~]# uname -a
> > >> FreeBSD backup.pulsar.bg 7.1-PRERELEASE FreeBSD
> > >> 7.1-PRERELEASE #1: Mon Oct 6 17:01:26 EEST 2008
> > >> root@backup.pulsar.bg:/usr/obj/usr/src/sys/MYCONF amd64
> > >>
> > >> [root@backup ~]# pciconf -lv
> > >> ...
> > >> re0@pci0:3:0:0: class=0x020000 card=0xe0001458
> > >> chip=0x816810ec rev=0x01 hdr=0x00
> > >> vendor = 'Realtek Semiconductor'
> > >> device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
> > >> class = network
> > >> subclass = ethernet
> > >> ...
> >
> > > Show me dmesg output pertinent to re(4).
> >
> > re0: port 0xd000-0xd0ff mem 0xf2000000-0xf2000fff irq 17 at device 0.0 on pci3
> > re0: turning off MSI enable bit.
> > re0: Chip rev. 0x38000000
> > re0: MAC rev. 0x00000000
> > miibus0: on re0
> > rgephy0: PHY 1 on miibus0
> > rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
> > re0: Ethernet address: 00:1f:d0:24:19:e9
> > re0: [FILTER]
> >
>
> It looks like your chip is RTL8168B and I don't see any errors in
> WOL related code of re(4). :-(
> Can you check the resolved link speed/duplex of FreeBSD box after
> shutdown?(You can enter to your switch menu and see the port
> status.)
> How about sending WOL packets over direct-connected UTP cable
> without using switch?
>

Here is a WOL patch which may fix the issue. Would you try the
following patch and let me know whether WOL works or not?
http://people.freebsd.org/~yongari/re/re.phy.patch.20081103

--
Regards,
Pyun YongHyeon

From randy at psg.com Sun Nov 2 22:46:09 2008
From: randy at psg.com (Randy Bush)
Date: Sun Nov 2 22:46:18 2008
Subject: installworld chflags failures
Message-ID: <490E9E2F.2010403@psg.com>

i386, fresh cvsup

FreeBSD 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #14: Sun Nov 2 12:13:46
GMT 2008 root@psg.com:/usr/obj/usr/src/sys/PSG i386

single luser mode over serial console

:/usr/src# time make installworld 2>&1 > installworld.log
install: /usr/lib/libkse.so.3: chflags: Operation not supported
install: /usr/lib/librt.so.1: chflags: Operation not supported
chflags: /usr/bin/chpass: Operation not supported
install: /usr/bin/login: chflags: Operation not supported
install: /usr/bin/opieinfo: chflags: Operation not supported
install: /usr/bin/opiepasswd: chflags: Operation not supported
chflags: /usr/bin/passwd: Operation not supported
install: /usr/bin/rlogin: chflags: Operation not supported
install: /usr/bin/rsh: chflags: Operation not supported
install: /usr/bin/su: chflags: Operation not supported
install: /usr/bin/crontab: chflags: Operation not supported
install: /usr/sbin/sliplogin: chflags: Operation not supported

this is new and different, and i am worried. no clue in UPDATING. no
clue in head.

randy

From koitsu at FreeBSD.org Sun Nov 2 22:53:08 2008
From: koitsu at FreeBSD.org (Jeremy Chadwick)
Date: Sun Nov 2 22:53:15 2008
Subject: installworld chflags failures
In-Reply-To: <490E9E2F.2010403@psg.com>
References: <490E9E2F.2010403@psg.com>
Message-ID: <20081103065306.GA13398@icarus.home.lan>

On Mon, Nov 03, 2008 at 03:46:07PM +0900, Randy Bush wrote:
> i386, fresh cvsup
>
> FreeBSD 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #14: Sun Nov 2 12:13:46
> GMT 2008 root@psg.com:/usr/obj/usr/src/sys/PSG i386
>
> single luser mode over serial console
>
> :/usr/src# time make installworld 2>&1 > installworld.log
> install: /usr/lib/libkse.so.3: chflags: Operation not supported
> install: /usr/lib/librt.so.1: chflags: Operation not supported
> chflags: /usr/bin/chpass: Operation not supported
> install: /usr/bin/login: chflags: Operation not supported
> install: /usr/bin/opieinfo: chflags: Operation not supported
> install: /usr/bin/opiepasswd: chflags: Operation not supported
> chflags: /usr/bin/passwd: Operation not supported
> install: /usr/bin/rlogin: chflags: Operation not supported
> install: /usr/bin/rsh: chflags: Operation not supported
> install: /usr/bin/su: chflags: Operation not supported
> install: /usr/bin/crontab: chflags: Operation not supported
> install: /usr/sbin/sliplogin: chflags: Operation not supported
>
> this is new and different, and i am worried. no clue in UPDATING. no
> clue in head.

Sounds like kern.securelevel is biting you, or possibly some very odd
filesystem mounting flags.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From randy at psg.com Sun Nov 2 23:00:03 2008
From: randy at psg.com (Randy Bush)
Date: Sun Nov 2 23:00:11 2008
Subject: installworld chflags failures
In-Reply-To: <20081103065306.GA13398@icarus.home.lan>
References: <490E9E2F.2010403@psg.com> <20081103065306.GA13398@icarus.home.lan>
Message-ID: <490EA171.8050600@psg.com>

Jeremy Chadwick wrote:
> On Mon, Nov 03, 2008 at 03:46:07PM +0900, Randy Bush wrote:
>> i386, fresh cvsup
>>
>> FreeBSD 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #14: Sun Nov 2 12:13:46
>> GMT 2008 root@psg.com:/usr/obj/usr/src/sys/PSG i386
>>
>> single luser mode over serial console
>>
>> :/usr/src# time make installworld 2>&1 > installworld.log
>> install: /usr/lib/libkse.so.3: chflags: Operation not supported
>> install: /usr/lib/librt.so.1: chflags: Operation not supported
>> chflags: /usr/bin/chpass: Operation not supported
>> install: /usr/bin/login: chflags: Operation not supported
>> install: /usr/bin/opieinfo: chflags: Operation not supported
>> install: /usr/bin/opiepasswd: chflags: Operation not supported
>> chflags: /usr/bin/passwd: Operation not supported
>> install: /usr/bin/rlogin: chflags: Operation not supported
>> install: /usr/bin/rsh: chflags: Operation not supported
>> install: /usr/bin/su: chflags: Operation not supported
>> install: /usr/bin/crontab: chflags: Operation not supported
>> install: /usr/sbin/sliplogin: chflags: Operation not supported
>>
>> this is new and different, and i am worried. no clue in UPDATING. no
>> clue in head.
>
> Sounds like kern.securelevel is biting you,

exactly. but in single user root? i thought that was not supposed to
happen. certainly did not use to happen.

randy

From koitsu at FreeBSD.org Sun Nov 2 23:04:05 2008
From: koitsu at FreeBSD.org (Jeremy Chadwick)
Date: Sun Nov 2 23:04:12 2008
Subject: installworld chflags failures
In-Reply-To: <490EA171.8050600@psg.com>
References: <490E9E2F.2010403@psg.com> <20081103065306.GA13398@icarus.home.lan> <490EA171.8050600@psg.com>
Message-ID: <20081103070403.GA13649@icarus.home.lan>

On Mon, Nov 03, 2008 at 04:00:01PM +0900, Randy Bush wrote:
> Jeremy Chadwick wrote:
> > On Mon, Nov 03, 2008 at 03:46:07PM +0900, Randy Bush wrote:
> >> i386, fresh cvsup
> >>
> >> FreeBSD 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #14: Sun Nov 2 12:13:46
> >> GMT 2008 root@psg.com:/usr/obj/usr/src/sys/PSG i386
> >>
> >> single luser mode over serial console
> >>
> >> :/usr/src# time make installworld 2>&1 > installworld.log
> >> install: /usr/lib/libkse.so.3: chflags: Operation not supported
> >> install: /usr/lib/librt.so.1: chflags: Operation not supported
> >> chflags: /usr/bin/chpass: Operation not supported
> >> install: /usr/bin/login: chflags: Operation not supported
> >> install: /usr/bin/opieinfo: chflags: Operation not supported
> >> install: /usr/bin/opiepasswd: chflags: Operation not supported
> >> chflags: /usr/bin/passwd: Operation not supported
> >> install: /usr/bin/rlogin: chflags: Operation not supported
> >> install: /usr/bin/rsh: chflags: Operation not supported
> >> install: /usr/bin/su: chflags: Operation not supported
> >> install: /usr/bin/crontab: chflags: Operation not supported
> >> install: /usr/sbin/sliplogin: chflags: Operation not supported
> >>
> >> this is new and different, and i am worried. no clue in UPDATING. no
> >> clue in head.
> >
> > Sounds like kern.securelevel is biting you,
>
> exactly. but in single user root? i thought that was not supposed to
> happen. certainly did not use to happen.

Did you reboot into single-user, or did you simply drop from
multi-user into single-user by killing init?
And what does "sysctl kern.securelevel" show you while in single-user mode? -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From randy at psg.com Sun Nov 2 23:18:50 2008 From: randy at psg.com (Randy Bush) Date: Sun Nov 2 23:18:57 2008 Subject: installworld chflags failures In-Reply-To: <20081103070403.GA13649@icarus.home.lan> References: <490E9E2F.2010403@psg.com> <20081103065306.GA13398@icarus.home.lan> <490EA171.8050600@psg.com> <20081103070403.GA13649@icarus.home.lan> Message-ID: <490EA5D7.3020307@psg.com> > Did you reboot into single-user, or did you simply drop from > multi-user into single-user by killing init? rebooted and was in out of band on serial console > And what does "sysctl kern.securelevel" show you while in single-user > mode? doh. i shoulda looked, eh? Enter full pathname of shell or RETURN for /bin/sh: id: not found grep: not found :/> sysctl kern.securelevel kern.securelevel: -1 :/> /etc/rc.d/hostid start Setting hostuuid: 6b70e4ac-874d-11dc-873e-003048293754. Setting hostid: 0x5ef5842d. 
:/> /etc/rc.d/zfs start :/> sysctl kern.securelevel kern.securelevel: -1 :/> cd /usr/src :/usr/src> bash :/usr/src# time make installworld 2>&1 > installworld.log install: /usr/lib/libkse.so.3: chflags: Operation not supported install: /usr/lib/librt.so.1: chflags: Operation not supported chflags: /usr/bin/chpass: Operation not supported install: /usr/bin/login: chflags: Operation not supported install: /usr/bin/opieinfo: chflags: Operation not supported install: /usr/bin/opiepasswd: chflags: Operation not supported chflags: /usr/bin/passwd: Operation not supported install: /usr/bin/rlogin: chflags: Operation not supported install: /usr/bin/rsh: chflags: Operation not supported install: /usr/bin/su: chflags: Operation not supported install: /usr/bin/crontab: chflags: Operation not supported install: /usr/sbin/sliplogin: chflags: Operation not supported real 2m7.290s user 0m30.610s sys 0m40.766s :/usr/src# sysctl kern.securelevel kern.securelevel: -1 From koitsu at FreeBSD.org Sun Nov 2 23:21:47 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Sun Nov 2 23:21:54 2008 Subject: installworld chflags failures In-Reply-To: <490EA5D7.3020307@psg.com> References: <490E9E2F.2010403@psg.com> <20081103065306.GA13398@icarus.home.lan> <490EA171.8050600@psg.com> <20081103070403.GA13649@icarus.home.lan> <490EA5D7.3020307@psg.com> Message-ID: <20081103072145.GA14071@icarus.home.lan> On Mon, Nov 03, 2008 at 04:18:47PM +0900, Randy Bush wrote: > > Did you reboot into single-user, or did you simply drop from > > multi-user into single-user by killing init? > > rebooted and was in out of band on serial console > > > And what does "sysctl kern.securelevel" show you while in single-user > > mode? > > doh. i shoulda looked, eh? > > > > Enter full pathname of shell or RETURN for /bin/sh: > id: not found > grep: not found > :/> sysctl kern.securelevel > kern.securelevel: -1 > :/> /etc/rc.d/hostid start > Setting hostuuid: 6b70e4ac-874d-11dc-873e-003048293754. 
> Setting hostid: 0x5ef5842d. > :/> /etc/rc.d/zfs start > :/> sysctl kern.securelevel > kern.securelevel: -1 > :/> cd /usr/src > :/usr/src> bash > :/usr/src# time make installworld 2>&1 > installworld.log > install: /usr/lib/libkse.so.3: chflags: Operation not supported > install: /usr/lib/librt.so.1: chflags: Operation not supported > chflags: /usr/bin/chpass: Operation not supported > install: /usr/bin/login: chflags: Operation not supported > install: /usr/bin/opieinfo: chflags: Operation not supported > install: /usr/bin/opiepasswd: chflags: Operation not supported > chflags: /usr/bin/passwd: Operation not supported > install: /usr/bin/rlogin: chflags: Operation not supported > install: /usr/bin/rsh: chflags: Operation not supported > install: /usr/bin/su: chflags: Operation not supported > install: /usr/bin/crontab: chflags: Operation not supported > install: /usr/sbin/sliplogin: chflags: Operation not supported Is /usr a ZFS filesystem or part of a zpool? If so, possibly you have some ZFS settings on your pool or filesystem which are inhibiting the ability to use chflags in some way? "zfs get all" will help. Otherwise, I don't have any immediate ideas. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From torfinn.ingolfsen at broadpark.no Mon Nov 3 01:01:37 2008 From: torfinn.ingolfsen at broadpark.no (Torfinn Ingolfsen) Date: Mon Nov 3 01:01:48 2008 Subject: FreeBSD 6.5-prerelease and if_re - patches needed? 
In-Reply-To: <20081103004714.GB94302@cdnetworks.co.kr> References: <20080921215704.eca7300b.torfinn.ingolfsen@broadpark.no> <20080922021022.GC26294@cdnetworks.co.kr> <20081002222542.849d5481.torfinn.ingolfsen@broadpark.no> <20081003083443.GD71518@cdnetworks.co.kr> <20081101200606.e50b5dbc.torfinn.ingolfsen@broadpark.no> <20081102050915.GA90993@cdnetworks.co.kr> <20081102122906.eeceebd8.torfinn.ingolfsen@broadpark.no> <20081103004714.GB94302@cdnetworks.co.kr> Message-ID: <20081103100121.4f551f1c.torfinn.ingolfsen@broadpark.no> On Mon, 03 Nov 2008 09:47:14 +0900 Pyun YongHyeon wrote: > I've changed to have re(4) wait the completion of DMAable memory > allocation during bus_dma cleanups. The delay you've seen may be > related with that change. Previously it just failed to load the > driver if there is no available memory at the time of driver > loading. However I guess that delay wouldn't happen if the driver > is statically linked into kernel. > Did you use kernel module? No, the driver is compiled into the kernel: root@kg-vm# ifconfig re0 re0: flags=8843 mtu 1500 options=1b inet 10.1.150.15 netmask 0xffff0000 broadcast 10.1.255.255 ether 00:1d:60:2c:80:f0 media: Ethernet autoselect (100baseTX ) status: active root@kg-vm# kldstat Id Refs Address Size Name 1 3 0xffffffff80100000 aa37d8 kernel 2 1 0xffffffff80ba4000 1a850 snd_hda.ko 3 2 0xffffffff80bbf000 35c80 sound.ko root@kg-vm# HTH -- Regards, Torfinn Ingolfsen From torfinn.ingolfsen at broadpark.no Mon Nov 3 01:06:05 2008 From: torfinn.ingolfsen at broadpark.no (Torfinn Ingolfsen) Date: Mon Nov 3 01:06:12 2008 Subject: installworld chflags failures In-Reply-To: <490EA5D7.3020307@psg.com> References: <490E9E2F.2010403@psg.com> <20081103065306.GA13398@icarus.home.lan> <490EA171.8050600@psg.com> <20081103070403.GA13649@icarus.home.lan> <490EA5D7.3020307@psg.com> Message-ID: <20081103100603.19d24c10.torfinn.ingolfsen@broadpark.no> On Mon, 03 Nov 2008 16:18:47 +0900 Randy Bush wrote: > :/> cd /usr/src > :/usr/src> bash 
Hmm, what happens if you do _not_ use bash here? bash is non-standard for a FreeBSD install (the procedure), so it might bite you. Or it might not. HTH -- Regards, Torfinn Ingolfsen From randy at psg.com Mon Nov 3 01:51:46 2008 From: randy at psg.com (Randy Bush) Date: Mon Nov 3 01:51:52 2008 Subject: installworld chflags failures In-Reply-To: <20081103072145.GA14071@icarus.home.lan> References: <490E9E2F.2010403@psg.com> <20081103065306.GA13398@icarus.home.lan> <490EA171.8050600@psg.com> <20081103070403.GA13649@icarus.home.lan> <490EA5D7.3020307@psg.com> <20081103072145.GA14071@icarus.home.lan> Message-ID: <490EC9AF.7020804@psg.com> > Is /usr a ZFS filesystem or part of a zpool? If so, possibly you have > some ZFS settings on your pool or filesystem which are inhibiting the > ability to use chflags in some way? "zfs get all" will help. aha! i suspect you are correct. but i can not decipher from man zfs which property it is. http://wiki.freebsd.org/ZFS tells me that chflags(2) support is done. still googling, but nothing exciting. xattr is extended attributes, and should default to on but is temp off for unknown reasons. is that it?
randy

--

# zfs get all tank/usr
NAME      PROPERTY       VALUE                  SOURCE
tank/usr  type           filesystem             -
tank/usr  creation       Wed Oct 8 1:02 2008    -
tank/usr  used           63.0G                  -
tank/usr  available      164G                   -
tank/usr  referenced     14.6G                  -
tank/usr  compressratio  1.00x                  -
tank/usr  mounted        yes                    -
tank/usr  quota          none                   default
tank/usr  reservation    none                   default
tank/usr  recordsize     128K                   default
tank/usr  mountpoint     /usr                   local
tank/usr  sharenfs       off                    default
tank/usr  checksum       on                     default
tank/usr  compression    off                    default
tank/usr  atime          on                     default
tank/usr  devices        on                     default
tank/usr  exec           on                     default
tank/usr  setuid         on                     default
tank/usr  readonly       off                    default
tank/usr  jailed         off                    default
tank/usr  snapdir        hidden                 default
tank/usr  aclmode        groupmask              default
tank/usr  aclinherit     secure                 default
tank/usr  canmount       on                     default
tank/usr  shareiscsi     off                    default
tank/usr  xattr          off                    temporary
tank/usr  copies         1                      default

From delphij at delphij.net Mon Nov 3 02:07:39 2008 From: delphij at delphij.net (Xin LI) Date: Mon Nov 3 02:07:46 2008 Subject: [Call for testers] bce(4) update Message-ID: <490ECD5E.8060203@delphij.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Dear 7-STABLE users, I would like to call for test for a patchset that is intended for MFC to 7-STABLE before 7.1-RELEASE. The patchset can be viewed/downloaded here: http://www.delphij.net/bce.diff (~1.2MB) For those with slow network, you can download the compressed version: http://www.delphij.net/bce.diff.bz2 (~210KB) In order to apply the patch, you will need to update your source tree to latest RELENG_7, then, cd into /usr/src/sys and use patch < ~/bce.diff (assume that you have downloaded it to your home directory), then recompile your kernel (or module). This patchset would add some new hardware support, plus a bunch of bugfixes done by davidch@ in the past months. Since this is a relatively big change, I would appreciate tests from bce(4) hardware owners, your help will aid us to make FreeBSD 7.1 a better release.
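The patch procedure described above could be scripted roughly as follows. This is only a sketch: the patch URL and the /usr/src paths come from the message, while the kernel configuration name (GENERIC), the download location ($HOME/bce.diff) and the helper names step and bce_patch_steps are assumptions made for illustration. By default it only prints the commands; set RUN=1 to actually execute them on an up-to-date RELENG_7 tree.

```sh
#!/bin/sh
# Sketch only: prints each step by default; set RUN=1 to execute.
# PATCH_URL and SRC_DIR are taken from the message; KERNCONF is an assumed value.
PATCH_URL="http://www.delphij.net/bce.diff"
SRC_DIR="/usr/src"
KERNCONF="${KERNCONF:-GENERIC}"

# step: echo a command line, and run it only when RUN=1 (hypothetical helper).
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$*" || exit 1
    fi
}

# bce_patch_steps: fetch the patch, apply it in sys/, rebuild and install the kernel.
bce_patch_steps() {
    step "fetch -o $HOME/bce.diff $PATCH_URL"
    step "cd $SRC_DIR/sys && patch < $HOME/bce.diff"
    step "cd $SRC_DIR && make buildkernel KERNCONF=$KERNCONF"
    step "cd $SRC_DIR && make installkernel KERNCONF=$KERNCONF"
}

bce_patch_steps
```

Running it without RUN set is a cheap way to review the four steps before committing to a kernel rebuild; a reboot is still needed afterwards to load the new kernel.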
For your convenience, here is a summary for the change: r176448,178132,178853,179436,179695,179771,182293 r176448 (davidch) - Added loose RX MTU functionality to allow frames larger than 1500 bytes to be accepted even though the interface MTU is set to 1500. - Implemented new TCP header splitting/jumbo frame support which uses two chains for receive traffic rather than the original single receive chain. - Added additional debug support code. r178132 (davidch) - Fixed a problem with the send chain consumer index which would cause TX traffic to sit in the send chain until a received packet kick started the interrupt handler. This would cause extremely slow performance when used with NFS over UDP. - Removed untested polling code. - Updated copyright year in the file header. - Removed inadvertent ^M's created by DOS text editor. r178853 (scottl) The BCE chips appear to have an undocumented requirement that RX frames be aligned on an 8 byte boundary. Prior to rev 1.36 (now r176448) this wasn't a problem because mbuf clusters tend be naturally aligned. The switch to using split buffers with the first buffer being the embedded data area of the mbuf has broken this assumption, at least on i386, causing a complete failure of RX functionality. Fix this for now by using a full cluster for the first RX buffer. A more sophisticated approach could be done with the old buffer scheme to realign the m_data pointer with m_adj(), but I'm also not clear on performance benefits of this old scheme or the performance implications of adding an m_adj() call to every allocation. r179436 (jhb) Trim an extra semi-colon. r179695 (davidch) - Fixed kern/123696 by increasing firmware timeout value from 100 to 1000. - Fixed a problem on i386 architecture when using split header/jumbo frame firmware caused by hardware alignment requirements. - Added #define BCE_USE_SPLIT_HEADER to allow the feature to be enabled/disabled. Enabled by default. 
PR: kern/123696 r179771 (davidch) - Added support for BCM5709 and BCM5716 controllers. r182293 (davidch) - Updated support for 5716. - Added some additional code for debug builds. - Fixed a problem printing physical memory on 64bit system during debugging. - Modified some of the context memory and mailbox register names to more clearly distinguish their use. - Added memory barriers for Intel CPUs when accessing host memory data structures which are written by hardware. - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (FreeBSD) iEYEARECAAYFAkkOzV4ACgkQi+vbBBjt66C3IgCgi5OVuzAIlzJ/cgSpZuWPdqvZ FAEAn0xUK/gp5VNwisDgcbGzfPh7jig4 =1CGv -----END PGP SIGNATURE----- From geo at pulsar.bg Mon Nov 3 02:43:06 2008 From: geo at pulsar.bg (Georgi Iovchev) Date: Mon Nov 3 04:24:27 2008 Subject: can not wake on lan after halt -p (or shutdown -p now) on releng_7 and releng_7_0 In-Reply-To: <20081103040859.GD94302@cdnetworks.co.kr> References: <596673353.20081006181334@pulsar.bg> <20081010012058.GA99376@cdnetworks.co.kr> <1948191744.20081010114326@pulsar.bg> <20081014064456.GE14769@cdnetworks.co.kr> <20081103040859.GD94302@cdnetworks.co.kr> Message-ID: <490ED5E7.40606@pulsar.bg> Pyun YongHyeon wrote: > On Tue, Oct 14, 2008 at 03:44:56PM +0900, To Georgi Iovchev wrote: > > On Fri, Oct 10, 2008 at 11:43:26AM +0300, Georgi Iovchev wrote: > > > > > > > > > -- > > > Friday, October 10, 2008, 4:20:58 AM: > > > > > > > On Mon, Oct 06, 2008 at 06:13:34PM +0300, Georgi Iovchev wrote: > > > >> Hello list > > > >> > > > >> I have a shutdown problem. I have a machine with gigabyte GA-G33M-DS2R > > > >> motherboard. Integrated network card is Realtek 8111B. > > > >> I can not wake the computer after I shutdown it from FreeBSD. > > > >> It is a dualboot system - windows xp and freebsd. If I shutdown the > > > >> computer from windows - later I can wake it up with magic packet. 
Even > > > >> if i shutdown the machine on the boot menu with the power button - than > > > >> later I can wake on lan. The only situation where I CANNOT wake it is > > > >> when I shutdown the machine from freebsd (halt -p). > > > >> > > > >> First I tested with 7.0-RELEASE-p5 amd64 (RELENG_7_0) and than I > > > >> upgraded to 7.1 PRERELASE amd64 (RELENG_7). I also tested with two > > > >> network cards - the integrated one Realtek 8111B and another one Intel > > > >> PRO1000PT PCI-E with WOL enabled. > > > >> > > > > > > > Don't know WOL issue of em(4) but re(4) should respond to WOL. > > > > 7.0-RELEASE had no support for WOL so RELENG_7 or 7.1-PRERELEASE > > > > should be used to experiment WOL. > > > Now I am using 7.1-prerelase > > > > > > >> With both nics and both freebsd versions the situation is the same - > > > >> after shutdown from bsd the computer is not able to wake on lan. The > > > > > > > Because you can wake up your sytem from Windows shutdown I think > > > > your BIOS is already configured to allow wakeup from WOL. Would > > > > you compare ethernet address of re(4) to Winwods? Have you tried to > > > > send Magic packets to FreeBSD box? > > > I have tried sending magic packets from another bsd machine. I am > > > using net/wol. I also tried to send magic packets from windows machine > > > using 3 different programs. > > > > > > > You may also try suspend your box with acpiconf and resume from WOL. > > > I cant. 
> > > > > > [root@backup ~]# acpiconf -s 5 > > > acpiconf: invalid sleep type (5) > > > > > > Actually I cant enter in any sleep state > > > [root@backup ~]# acpiconf -s 4 > > > acpiconf: request sleep type (4) failed: Operation not supported > > > [root@backup ~]# acpiconf -s 3 > > > acpiconf: request sleep type (3) failed: Operation not supported > > > [root@backup ~]# acpiconf -s 2 > > > acpiconf: request sleep type (2) failed: Operation not supported > > > [root@backup ~]# acpiconf -s 1 > > > acpiconf: request sleep type (1) failed: Operation not supported > > > > > > I am using generic kernel with little modifications, (generally i have > > > commented many unused drivers - raid, if_....) Acpi is in generic > > > kernel now. > > > > > > I even tried to wake the machine with magic packet after shutdown -h. > > > But still no luck. > > > > > > > > > >> indication on the switch port says that after shut down there is > > > >> active link. > > > >> > > > > > > > That indicates the controller is alive so it shall respond to WOL > > > > if it was correctly configured to receive WOL packets. Have you > > > > tried to send Magic packets to FreeBSD box? > > > > > > >> Here is some information after last update: > > > >> > > > >> [root@backup ~]# uname -a > > > >> FreeBSD backup.pulsar.bg 7.1-PRERELEASE FreeBSD > > > >> 7.1-PRERELEASE #1: Mon Oct 6 17:01:26 EEST 2008 > > > >> root@backup.pulsar.bg:/usr/obj/usr/src/sys/MYCONF amd64 > > > >> > > > >> [root@backup ~]# pciconf -lv > > > >> ... > > > >> re0@pci0:3:0:0: class=0x020000 card=0xe0001458 > > > >> chip=0x816810ec rev=0x01 hdr=0x00 > > > >> vendor = 'Realtek Semiconductor' > > > >> device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC' > > > >> class = network > > > >> subclass = ethernet > > > >> ... > > > > > > > Show me dmesg output pertinent to re(4). > > > > > > re0: port 0xd000-0xd0ff mem 0xf2000000-0xf2000fff irq 17 at device 0.0 on pci3 > > > re0: turning off MSI enable bit. > > > re0: Chip rev. 
0x38000000 > > > re0: MAC rev. 0x00000000 > > > miibus0: on re0 > > > rgephy0: PHY 1 on miibus0 > > > rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto > > > re0: Ethernet address: 00:1f:d0:24:19:e9 > > > re0: [FILTER] > > > > > > > It looks like your chip is RTL8168B and I don't see any errors in > > WOL related code of re(4). :-( > > Can you check the resolved link speed/duplex of FreeBSD box after > > shutdown?(You can enter to your switch menu and see the port > > status.) > > How about sending WOL packets over direct-connected UTP cable > > without using switch? > > Hello again :) I have resumed my experiments with wol again. I checked the link speed from the switch web gui. It says "100M Full". > Here is WOL patch which may fix the issue. Would you try the > following patch and let me know whether WOL works or not? > http://people.freebsd.org/~yongari/re/re.phy.patch.20081103 > I tried your patch. Just to be sure that I did it correctly - here are my steps: cd / patch < /path/re.phy.patch.20081103 cd /sys/modules/re make all install reboot After reboot I shut down the machine and tried to wake it from another computer. Still WOL does not work. From mike at sentex.net Mon Nov 3 06:48:08 2008 From: mike at sentex.net (Mike Tancsa) Date: Mon Nov 3 06:48:16 2008 Subject: fifo log problem Message-ID: <200811031448.mA3Em2Ow024387@lava.sentex.ca> I have been taking a look at the fifolog(1) system in RELENG_7 and I must be missing something obvious. I created a file using default params e.g fifolog_create /var/log/all.fifo and then in /etc/syslog.conf I have *.* /var/log/all.log *.* | /usr/sbin/fifolog_writer /var/log/all.fifo It seems to work for the most part, but there are entries that are missing throughout the log e.g.
in the traditional all.log I have # wc all.log 4833 55212 398099 all.log yet the fifo log file I have # fifolog_reader all.fifo | wc From 0 Wed Dec 31 19:00:00 1969 To 1225722724 Mon Nov 3 09:32:04 2008 Read from 0 223 2783 23271 There does not seem to be any pattern as to what it discards / keeps ---Mike -------------------------------------------------------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet since 1994 www.sentex.net Cambridge, Ontario Canada www.sentex.net/mike From swhetzel at gmail.com Mon Nov 3 07:31:43 2008 From: swhetzel at gmail.com (Scot Hetzel) Date: Mon Nov 3 07:31:51 2008 Subject: installworld chflags failures In-Reply-To: <490E9E2F.2010403@psg.com> References: <490E9E2F.2010403@psg.com> Message-ID: <790a9fff0811030731g1638ea48q96dd1783a4a9ce59@mail.gmail.com> On 11/3/08, Randy Bush wrote: > i386, fresh cvsup > > FreeBSD 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #14: Sun Nov 2 12:13:46 > GMT 2008 root@psg.com:/usr/obj/usr/src/sys/PSG i386 > > single luser mode over serial console > > :/usr/src# time make installworld 2>&1 > installworld.log > install: /usr/lib/libkse.so.3: chflags: Operation not supported > install: /usr/lib/librt.so.1: chflags: Operation not supported > chflags: /usr/bin/chpass: Operation not supported > install: /usr/bin/login: chflags: Operation not supported > install: /usr/bin/opieinfo: chflags: Operation not supported > install: /usr/bin/opiepasswd: chflags: Operation not supported > chflags: /usr/bin/passwd: Operation not supported > install: /usr/bin/rlogin: chflags: Operation not supported > install: /usr/bin/rsh: chflags: Operation not supported > install: /usr/bin/su: chflags: Operation not supported > install: /usr/bin/crontab: chflags: Operation not supported > install: /usr/sbin/sliplogin: chflags: Operation not supported > > this is new and different, and i am worried. no clue in UPDATING. no > clue in head.
> ZFS currently doesn't support chflags(2): FreeBSD hp010.hetzel.org 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Mon Oct 27 22:02:16 CDT 2008 root@hp010.hetzel.org:/usr/obj/usr/src/8x-zfs/sys/DV8135NR amd64 hp010# touch t ; chflags arch t chflags: t: Operation not supported hp010# touch t ; chflags opaque t chflags: t: Operation not supported hp010# touch t ; chflags nodump t chflags: t: Operation not supported hp010# touch t ; chflags sappnd t chflags: t: Operation not supported hp010# touch t ; chflags schg t chflags: t: Operation not supported hp010# touch t ; chflags sunlnk t chflags: t: Operation not supported hp010# touch t ; chflags uappnd t chflags: t: Operation not supported hp010# touch t ; chflags uchg t chflags: t: Operation not supported hp010# touch t ; chflags uunlnk t chflags: t: Operation not supported hp010# rm t Even though http://wiki.freebsd.org/ZFS tells says that chflags(2) support is done. (see http://perforce.freebsd.org/changeView.cgi?CH=147105), it hasn't been merged from perforce to -CURRENT. Scot From mike at sentex.net Mon Nov 3 07:47:34 2008 From: mike at sentex.net (Mike Tancsa) Date: Mon Nov 3 07:47:40 2008 Subject: fifo log problem In-Reply-To: <43295.1225726136@critter.freebsd.dk> References: <43295.1225726136@critter.freebsd.dk> Message-ID: <200811031547.mA3FlVVs024666@lava.sentex.ca> At 10:28 AM 11/3/2008, Poul-Henning Kamp wrote: >In message <200811031448.mA3Em2Ow024387@lava.sentex.ca>, Mike Tancsa writes: > >I have been taking a look at the fifolog(1) system in RELENG_7 and I > >must be missing something obvious. I created a file using default params > >e.g > > > >fifolog_create /var/log/all.fifo > >and then in /etc/syslog.conf I have > >*.* /var/log/all.log > >*.* | /usr/sbin/fifolog_writer /var/log/all.fifo > > > >It seems to work for the most part, but there are entries that are > >missing throughout the log > > > >e.g. 
in the traditional all.log I have > ># wc all.log > > 4833 55212 398099 all.log > > > >yet the fifo log file I have > > > ># fifolog_reader all.fifo | wc > >>From 0 Wed Dec 31 19:00:00 1969 > >To 1225722724 Mon Nov 3 09:32:04 2008 > >Read from 0 > > 223 2783 23271 > > > >There does not seem to be any pattern as to what it discards / keeps > >Try using "cat" instead of fifolog_writer, so we can tell on which >side of the pipe we are looking for the trouble ? Hi, Seems to work fine with cat *.* /var/log/all.log *.* | /usr/sbin/fifolog_writer /var/log/all.fifo *.* | cat > /var/log/all.cat ---Mike >-- >Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 >phk@FreeBSD.ORG | TCP/IP since RFC 956 >FreeBSD committer | BSD since 4.3-tahoe >Never attribute to malice what can adequately be explained by incompetence. From phk at phk.freebsd.dk Mon Nov 3 07:58:02 2008 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Mon Nov 3 07:58:09 2008 Subject: fifo log problem In-Reply-To: Your message of "Mon, 03 Nov 2008 09:48:08 EST." <200811031448.mA3Em2Ow024387@lava.sentex.ca> Message-ID: <43295.1225726136@critter.freebsd.dk> In message <200811031448.mA3Em2Ow024387@lava.sentex.ca>, Mike Tancsa writes: >I have been taking a look at the fifolog(1) system in RELENG_7 and I >must be missing something obvious. I created a file using default params >e.g > >fifolog_create /var/log/all.fifo >and then in /etc/syslog.conf I have >*.* /var/log/all.log >*.* | /usr/sbin/fifolog_writer /var/log/all.fifo > >It seems to work for the most part, but there are entries that are >missing throughout the log > >e.g. 
in the traditional all.log I have ># wc all.log > 4833 55212 398099 all.log > >yet the fifo log file I have > ># fifolog_reader all.fifo | wc >>From 0 Wed Dec 31 19:00:00 1969 >To 1225722724 Mon Nov 3 09:32:04 2008 >Read from 0 > 223 2783 23271 > >There does not seem to be any pattern as to what it discards / keeps Try using "cat" instead of fifolog_writer, so we can tell on which side of the pipe we are looking for the trouble ? -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From phk at phk.freebsd.dk Mon Nov 3 08:34:09 2008 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Mon Nov 3 08:34:16 2008 Subject: fifo log problem In-Reply-To: Your message of "Mon, 03 Nov 2008 10:47:39 EST." <200811031547.mA3FlVVs024666@lava.sentex.ca> Message-ID: <43507.1225730046@critter.freebsd.dk> In message <200811031547.mA3FlVVs024666@lava.sentex.ca>, Mike Tancsa writes: >Seems to work fine with cat Ok, and the loss is not from one end, it is random records in the middle ? -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From mike at sentex.net Mon Nov 3 08:47:56 2008 From: mike at sentex.net (Mike Tancsa) Date: Mon Nov 3 08:48:02 2008 Subject: fifo log problem In-Reply-To: <43507.1225730046@critter.freebsd.dk> References: <43507.1225730046@critter.freebsd.dk> Message-ID: <200811031647.mA3GlruN025099@lava.sentex.ca> At 11:34 AM 11/3/2008, Poul-Henning Kamp wrote: >In message <200811031547.mA3FlVVs024666@lava.sentex.ca>, Mike Tancsa writes: > > >Seems to work fine with cat > >Ok, and the loss is not from one end, it is random records in >the middle ? Yes, they seem to initially get written and then tail off for some reason. I am not sure why. 
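To check whether the dropped records follow any pattern, the plain capture written through cat can be compared against what fifolog_reader returns. A minimal sketch, assuming each line of the fifolog_reader output is an epoch-seconds field followed by the original syslog line (that layout is an assumption, not something confirmed here), and using a hypothetical helper name missing_records:

```sh
#!/bin/sh
# missing_records REFERENCE FIFO_DUMP  (hypothetical helper)
# Prints lines that are present in REFERENCE (e.g. the file written via cat)
# but absent from FIFO_DUMP (saved fifolog_reader output).
missing_records() {
    ref_sorted=$(mktemp /tmp/ref.XXXXXX) || return 1
    got_sorted=$(mktemp /tmp/got.XXXXXX) || { rm -f "$ref_sorted"; return 1; }

    sort "$1" > "$ref_sorted"
    # Assumed layout: strip the leading epoch-seconds field before comparing.
    sed 's/^[0-9][0-9]* //' "$2" | sort > "$got_sorted"

    # comm -23 keeps only the lines unique to the first (reference) file.
    comm -23 "$ref_sorted" "$got_sorted"

    rm -f "$ref_sorted" "$got_sorted"
}
```

For example, after saving the reader output with fifolog_reader /var/log/all.fifo > /tmp/fifo.dump (the dump path is arbitrary), running missing_records /var/log/all.cat /tmp/fifo.dump lists the records that reached cat but not the fifolog, which makes it easier to see whether the losses cluster around particular times.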
Actually, if I SIGHUP syslogd, it seems to make a difference, in that I can generally see when newsyslog sig HUPs syslog to do log rotation. Perhaps this is confusing things ? e.g. 1225628270 Nov 2 07:17:50 st32278 ovpn-kit[1047]: Initialization Sequence Completed 1225641602 Nov 2 11:00:02 st32278 syslogd: restart 1225641608 Nov 2 11:00:08 st32278 ppp[927]: tun0: Chat: deflink: Redial timer expired. In this snippet the last entry was 07:17 for some reason and then the SIGHUP from newsyslog seems to wake things up for some reason. ---Mike >-- >Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 >phk@FreeBSD.ORG | TCP/IP since RFC 956 >FreeBSD committer | BSD since 4.3-tahoe >Never attribute to malice what can adequately be explained by incompetence. From kensmith at cse.Buffalo.EDU Mon Nov 3 08:57:09 2008 From: kensmith at cse.Buffalo.EDU (Ken Smith) Date: Mon Nov 3 08:57:16 2008 Subject: FreeBSD 6.4-RC2 available... Message-ID: <1225731426.42716.42.camel@bauer.cse.buffalo.edu> The second Release Candidate for FreeBSD 6.4 is now available. FreeBSD 6.4-RC2 should be the last of the public test builds for the FreeBSD 6.4 release cycle. Unless a big show-stopper is found from this round of testing we should begin the 6.4-RELEASE builds in about a week and a half. We encourage you to test out 6.4-RC2 and report any problems by submitting PRs or via email to the freebsd-stable list. One of the things that could use some testing is the late-arriving patches to sysinstall that should eliminate the excessive disc swapping that previous releases had if things like gnome or kde were installed as part of the initial CDROM install. Things selected before you reach the "Package Selection Menu" are handled as separate passes from the things that get selected at that menu. So depending on what you install before you reach that menu (e.g. Xorg selected as part of the distributions) and what you select at that menu there may still be some disc swapping involved. 
But "worst case" given the current layout it should ask for any given disc no more than twice. For amd64 and i386 DVD images are also available. The DVD images include the install bits, livefs, docs, and the same set of packages that are available on the CDROMs all in one image. So, you can choose to download/burn the DVD image to a DVD, or you can choose to download/burn the three CDROM-sized discs and use those. The ISO images and FTP install trees are available on the FreeBSD Mirror sites. Using the primary site as an example: ftp://ftp.freebsd.org/pub/FreeBSD/releases/${arch}/ISO-IMAGES/6.4/ where ${arch} is one of alpha, amd64, i386, pc98, or sparc64. Checksums for the ISO images are at the bottom of this message. The amd64 and i386 sets include what should be the final selection of packages that will be included with the release. If you would like to do a source-based update to 6.4-RC2 from an already installed machine you can update your tree to RELENG_6_4 using normal cvsup/csup methods. Note that as a somewhat inconvenient side-effect of the primary FreeBSD source repository now being in SVN the creation of the RELENG_6_4 branch in the CVS repository wound up checking in a "new" version of every file, in some cases only changing the FBSDID. That will probably make mergemaster a bit tedious. Sorry for the inconvenience. The freebsd-update(8) utility supports binary upgrades of i386 and amd64 systems running earlier FreeBSD releases. Systems running 6.3-RELEASE, 6.4-BETA, or 6.4-RC1 can upgrade as follows: # freebsd-update upgrade -r 6.4-RC2 During this process, FreeBSD Update may ask the user to help by merging some configuration files or by confirming that the automatically performed merging was done correctly. # freebsd-update install The system must be rebooted with the newly installed kernel before continuing. 
# shutdown -r now After rebooting, freebsd-update needs to be run again to install the new userland components, and the system needs to be rebooted again: # freebsd-update install # shutdown -r now Note that FreeBSD Update stores downloaded upgrades in /var/db/freebsd-update, so at least 400MB should be free in /var before running freebsd-update; if the /var partition is too small, the -d option to freebsd-update can be used to indicate that the upgrades should be stored in a different directory. Checksums: MD5 (6.4-RC2-alpha-bootonly.iso) = 0c1a10fcb84e4bcc7efbd7c843726f34 MD5 (6.4-RC2-alpha-disc1.iso) = af9ec02034b7d9833f1af9d2fafbfa38 MD5 (6.4-RC2-alpha-disc2.iso) = 80d7e1cf89be3f88af1b99fbbc48e27f MD5 (6.4-RC2-alpha-disc3.iso) = 693e09dafa19e995303c02970e13acb3 MD5 (6.4-RC2-alpha-docs.iso) = 671a1f48159b63df0decc011d91f9772 MD5 (6.4-RC2-amd64-bootonly.iso) = 67da9f580f8b33762c07c2a85a621534 MD5 (6.4-RC2-amd64-disc1.iso) = ffab4610c8fb807496b289237000aa93 MD5 (6.4-RC2-amd64-disc2.iso) = d1496e78dbb60d4e9210191fd3d57e76 MD5 (6.4-RC2-amd64-disc3.iso) = 708a0d328be5bfd3deb2ac35986aadee MD5 (6.4-RC2-amd64-docs.iso) = 08588036702646adbe81f78c3bcafdaa MD5 (6.4-RC2-amd64-dvd1.iso) = 2c2dcc94097aaeec1c7a50101236e2bf MD5 (6.4-RC2-i386-bootonly.iso) = 7ed7b049d14eb170c2d4c044312bccef MD5 (6.4-RC2-i386-disc1.iso) = d114cdad5502bff3e8182ddd42d81ab3 MD5 (6.4-RC2-i386-disc2.iso) = c0cbbfcb2a2f2d81974a33c85e242155 MD5 (6.4-RC2-i386-disc3.iso) = cbff7482228883da14f1f85070a89422 MD5 (6.4-RC2-i386-docs.iso) = 4663810144153a21aad6c81768af7358 MD5 (6.4-RC2-i386-dvd1.iso) = 1d3608538d8476c8df64d8e91a03aa22 MD5 (6.4-RC2-pc98-bootonly.iso) = a059b9178c0d985ed729a2d33a1d1577 MD5 (6.4-RC2-pc98-disc1.iso) = b1508c34d2f595ea44c446be1e989c57 MD5 (6.4-RC2-sparc64-bootonly.iso) = 1311e40fb9ccc786994098444d1e18f0 MD5 (6.4-RC2-sparc64-disc1.iso) = 45b629d802f16a73f13e77e70c2f0bfc MD5 (6.4-RC2-sparc64-disc2.iso) = ddd1d1ad89cf51eac2e5fc2acdfadadc MD5 (6.4-RC2-sparc64-disc3.iso) =
a348d7a193d5f07c8ff11ac8f0d5e478 MD5 (6.4-RC2-sparc64-docs.iso) = 4a3f9d15c149d98096fb9224a1b338c6 SHA256 (6.4-RC2-alpha-bootonly.iso) = c1b4aff4134572bd8c0d8c743730fc026f45fccb4a3b15b88d2c706913dd5815 SHA256 (6.4-RC2-alpha-disc1.iso) = dbb29cfd589fb60faf881c94359d68354f9b9e302eb6660100e6ae8e5defd228 SHA256 (6.4-RC2-alpha-disc2.iso) = dd9c8680def3d883454ebcb4ec27ebef294b305641651fb32b283694be256404 SHA256 (6.4-RC2-alpha-disc3.iso) = 09c62551c83c1e15c943c684ac7f2abc3aae7d2038a8678ad65d7378f4180637 SHA256 (6.4-RC2-alpha-docs.iso) = 20164ab985f2969da8279476a4f50dc7390194634a3a077a0bafa14ab8d308d4 SHA256 (6.4-RC2-amd64-bootonly.iso) = 7a9942711c78216123c6579777db2270b18519c35ef4961ef8127ed690b1008f SHA256 (6.4-RC2-amd64-disc1.iso) = fdbb5975f2319c7bc27d6897cb98d20777e9f99203341e46a54da36618ff5c7a SHA256 (6.4-RC2-amd64-disc2.iso) = b1cf88187df4293b5b1c952d9f2f8ec4ed2d9bb1d1ab79c284686593389b5802 SHA256 (6.4-RC2-amd64-disc3.iso) = 451b67b617e1112ce2678b7de6d805fef778b389fc2cf474a4ca7ec8f9139b0a SHA256 (6.4-RC2-amd64-docs.iso) = 2d7e71caa30a3a5de987a9d504eb6357124c844c5c596700e23d80dcc9cff06d SHA256 (6.4-RC2-amd64-dvd1.iso) = d7c8600752d57c1787b3ac74a50e4ce110648fc441f4210bf4569f3e6caef3ce SHA256 (6.4-RC2-i386-bootonly.iso) = 2167b136f46b77cc12b75914c388d8f15609b997fe17f5bf0beaadfc7ad92fa2 SHA256 (6.4-RC2-i386-disc1.iso) = abebefde3cdddfd234f04c632ee09162f12cfafda4bab655e18727d5d50a9616 SHA256 (6.4-RC2-i386-disc2.iso) = 446275afd27dc256bec7857a7455dc8add04fe9f65dfc6f8040cfe68d9633f63 SHA256 (6.4-RC2-i386-disc3.iso) = ddb155aa3961dae9f5ec1d694eed3eec81ad069105e539040fc00c5ce3956986 SHA256 (6.4-RC2-i386-docs.iso) = dd1fb03fbe8f1c190666373fa3bc14200aa8b2558ac4220d4568a588c2bb19ae SHA256 (6.4-RC2-i386-dvd1.iso) = f95d924ed4f03caf40ff9c11cf7b06fcb69752a2aeda25d34a3bf78adea689fc SHA256 (6.4-RC2-pc98-bootonly.iso) = 9087809da5db4ef92ba120d2f0fa0167ca54d70e59bc49289c6fc49529e9bdb3 SHA256 (6.4-RC2-pc98-disc1.iso) = f0243d19cba1db9198c7d299b77025f1ffba0fdc95a5cab24346e90129a3b6ae 
SHA256 (6.4-RC2-sparc64-bootonly.iso) = f1613840cb3190125952dd4ede0388a6add03b31dce5e092d0b7c30d9c439e67 SHA256 (6.4-RC2-sparc64-disc1.iso) = 32281bfa09ccf02fe7fa5e82f8b92951f388ef40f5d77ef25603485675f9c75c SHA256 (6.4-RC2-sparc64-disc2.iso) = 20b965b9301eff26470248e13e3f946cc351e50871c45031622f37c0efd9317b SHA256 (6.4-RC2-sparc64-disc3.iso) = b5c7fd70ba2fcd7e16effd9e9c7fd645b74bbd899249d392c78321990e84097b SHA256 (6.4-RC2-sparc64-docs.iso) = 139c3219f6ed001cda98199456122f12852ead59b0e4cc739d1dcb50b62e7c3e -- Ken Smith - From there to here, from here to | kensmith@cse.buffalo.edu there, funny things are everywhere. | - Theodore Geisel | -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081103/b0d7fc4a/attachment.pgp From m4gicite at gmail.com Mon Nov 3 10:06:48 2008 From: m4gicite at gmail.com (Ryan) Date: Mon Nov 3 10:06:57 2008 Subject: Install issues with 7.x In-Reply-To: References: Message-ID: Thanks for your interest Robert, unfortunately it was a no-go. I went ahead and tested 6.4RC1 and 6.3. Now all I am getting are ACPI errors which I would think I could get the installer going by disabling ACPI. And yes, I'm running on the latest - and only - bios revision for the laptop. The following were done under default boot option. No ACPI did not generate any error messages and hung, single user mode acted the same as default, and safe mode created an interrupt storm like 7.x did. 6.4-RC1 ... acpi_timer0: <24-bit timer at 3.579545 MHz> port 0x408-0x40b on acpi0 acpi_ec0: port 0x62, 0x66 on acpi0 cpu0: on acpi0 ACPI-0328: *** Error: No pointer back to NS node in buffer obj 0xc85c87c0 ACPI-1304: *** Error: Method execution failed [\_PR_.CPU0._OSC] (Node 0xc84e2580), AE_AML_INTERNAL acpi_throttle0: on cpu0 ... 
then this gets spammed 6 times at the end ACPI-0438: *** Error: Looking up [\_PR_.CPU0._PPC] in name space, AE_NOT_FOUND SearchNode 0xc84cdcc0 StartNode 0xc84cdcc0 ReturnNode 0 ACPI-1304: *** Error: Method execution failed [\_SB_.AC__.ADJP] (Node 0xc84cdcc0), AE_NOT_FOUND ACPI-1304: *** Error: Method execution failed [\_SB.AC__._PSR] (Node 0xc84cdd00), AE_NOT_FOUND 6.3-REL 6.3 gives the same errors but with different node addresses. ... acpi_timer0: <24-bit timer at 3.579545 MHz> port 0x408-0x40b on acpi0 acpi_ec0: port 0x62, 0x66 on acpi0 cpu0: on acpi0 ACPI-0328: *** Error: No pointer back to NS node in buffer obj 0xc85c94c0 ACPI-1304: *** Error: Method execution failed [\_PR_.CPU0._OSC] (Node 0xc84e3780), AE_AML_INTERNAL acpi_throttle0: on cpu0 ... spammed again 6 times at the end ACPI-0438: *** Error: Looking up [\_PR_.CPU0._PPC] in name space, AE_NOT_FOUND SearchNode 0xc84cdd20 StartNode 0xc84cdd20 ReturnNode 0 ACPI-1304: *** Error: Method execution failed [\_SB_.AC__.ADJP] (Node 0xc84cdd20), AE_NOT_FOUND ACPI-1304: *** Error: Method execution failed [\_SB.AC__._PSR] (Node 0xc84e3780), AE_NOT_FOUND Help at all? On Sun, Nov 2, 2008 at 9:50 AM, Robert Watson wrote: > On Wed, 29 Oct 2008, Ryan wrote: > >> Hello, I purchased a new Clevo M860TU on the account that it ran linux >> very well and was hoping it would fair the same on FreeBSD. Not so much, >> little help? I posted this in mobile originally but though stable would be a >> better choice. Don't know if it is more appropriate here or ACPI. >> >> I'm giving you as much information as I know how to get. as I cannot get >> sysinstall to load I am having to type all these dmesg. The boot process is >> hanging. This is all with 7.x, I can give 6.x if needed. > > xpt_config is the CAM configuration wait, so basically the system is waiting > for a storage device to report back on whether it could be used as a root > file system. 
> > I recently saw a similar report of problems involving a firewire controller > on an nvidia motherboard following an upgrade to 7.x, and I wonder if you > might try the following: see if 6.4 will install, and if so, install it. > Then cvsup 7.x, and do a buildworld but not an installworld. This will let > you build and experiment with 7.x kernels from a known-working environment. > > Make sure to keep a working 6.x kernel around -- I suggest something like > "cp -r /boot/kernel /boot/kernel.good" before starting so you can always > fall back to a good kernel. Now try building a 7.x kernel without USB or > firewire support, and booting that? > > Also, it's worth checking there are no BIOS upgrades available for the > motherboard... > > Robert N M Watson > Computer Laboratory > University of Cambridge > > >> >> Hardware: >> Intel P9500 >> 4gb DDR3-1066 >> Nvidia 9800M GT >> Atheros AR5006e >> >> FreeBSD 7.1-BETA2 >> >> These snippets of dmesg happen around the end where it hangs. >> >> 1. Default >> >> ... >> cpu0: on acpi0 >> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj >> 0xc6a02d40 [20070320] >> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving >> operands for [OpcodeName unavailable] [20070320] >> ACPI Error (psparse-0626): Method parse/execution failed >> [\_PR_.CPU0._OSC] (Node 0xc68556e0), AE_AML_INTERNAL >> est0: on cpu0 >> p4tcc0: on cpu0 >> cpu1: on acpi0 >> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj >> 0xc6a0e300 [20070320] >> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving >> operands for [OpcodeName unavailable] [20070320] >> ACPI Error (psparse-0626): Method parse/execution failed >> [\_PR_.CPU1._OSC] (Node 0xc685560), AE_AML_INTERNAL >> est1: on cpu1 >> p4tcc1: on cpu1 >> ... 
>> cpu0: Cx states changed >> cpu1: Cx states changed >> unknown: timeout waiting for read DRQ >> unknown: timeout waiting for read DRQ >> acd0: DVDR at ata3-master UDMA33 >> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install >> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config >> >> Then just stalls >> >> 2. No ACPI >> >> ... >> unknown: timeout waiting for read DRQ >> unknown: timeout waiting for read DRQ >> acd0: DVDR at ata3-master UDMA33 >> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install >> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config >> >> Then just stalls >> >> 3. Safe Mode >> >> I can only tell you a little because console is spammed. It is the >> same as no ACPI, but with an interrupt storm. >> >> ... 
>> unknown: timeout waiting for read DRQ >> unknown: timeout waiting for read DRQ >> acd0: DVDR at ata3-master UDMA33 >> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install >> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config >> >> When it gets to the unknowns, this is spammed. >> >> interrupt storm detected on "irq10:"; throttling interrupt source >> >> Other than the interrupt storm spam, it is halted like the others. >> >> >> 4. Single User Mode >> >> Same as 1, Default >> >> >> 5. Verbose >> >> All I can tell you is what is spammed at the end. >> >> acpi: bad write to port 0x080 (32), val hex >> >> Where hex is ever increasing and loops when it hits 0xff01. I can also >> see run_interrupt_driven_hooks message in all the spam. >> >> Using some googling if you add the sysctl before boot >> >> debug.acpi.block_bad_io=1 >> >> it might be of some help. This just leads to a never ending loop of >> acpi errors - the scroll very fast and difficult to record might I >> add! >> >> ... >> acpi: bad write to port 0x080 (32), val hex >> ACPI Exception (evregion-0529): AE_BAD_PARAMETER, Returned by handler >> for [SystemIO] [20070320] >> ACPI Error (psparse-0626): Method parse/execution failed [\P8XH] (Node >> 0xc6850a60), AE_BAD_PARAMETER >> ACPI Error (psparse-0626): Method parse/execution failed [\_GPE._L01] >> [20070320] >> ACPI Exception (evgpe-0687): AE_BAD_PARAMETER, while evauating GPE >> method [_L01] [20070320] >> --repeat-- >> ... >> >> >> FreeBSD 7.0-REL >> >> 7.0 is a little different than 7.1. Messages are somewhat the same but >> they happen near the beginning of dmesg instead of around the end. 
The >> run_interrupt_driven_hooks issue is nonexistant as well, but it still >> hangs. I'm guessing that's a debug tool more than an error. >> >> 1. Default >> >> ... >> cpu0: on acpi0 >> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj >> 0xc6862580 [20070320] >> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving >> operands for [OpcodeName unavailable] [20070320] >> ACPI Error (psparse-0626): Method parse/execution failed >> [\_PR_.CPU0._OSC] (Node 0xc682d580), AE_AML_INTERNAL >> est0: on cpu0 >> p4tcc0: on cpu0 >> cpu1: on acpi0 >> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj >> 0xc6861100 [20070320] >> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving >> operands for [OpcodeName unavailable] [20070320] >> ACPI Error (psparse-0626): Method parse/execution failed >> [\_PR_.CPU1._OSC] (Node 0xc682d4a0), AE_AML_INTERNAL >> est1: on cpu1 >> p4tcc1: on cpu1 >> ... >> cpu0: Cx states changed >> cpu1: Cx states changed >> unknown: timeout waiting for read DRQ >> unknown: timeout waiting for read DRQ >> acd0: DVDR at ata3-master UDMA33 >> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install >> >> Hangs. >> >> 2. No ACPI >> >> .. >> unknown: timeout waiting for read DRQ >> unknown: timeout waiting for read DRQ >> .. >> >> Hangs. >> >> >> 3. Safe Mode >> >> Same interrupt storm as 7.1-BETA2. >> >> ... >> interrupt storm detected on "irq10:"; throttling interrupt source >> --repeat-- >> >> 4. Single User Mode >> >> Same as 1. Default. >> >> >> 5. Verbose >> >> Hang like normal, cannot see the ACPI errors since they fly off the >> scroll lock buffer. >> >> ... >> cpu0: Cx states changed >> cpu1: Cx states changed >> ... >> unknown: timeout waiting for read DRQ >> unknown: timeout waiting for read DRQ >> ... >> >> >> Thanks again. 
>> _______________________________________________ >> freebsd-stable@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-stable >> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" >> > From joel at FreeBSD.org Mon Nov 3 11:00:03 2008 From: joel at FreeBSD.org (Joel Dahl) Date: Mon Nov 3 11:00:09 2008 Subject: Install issues with 7.x In-Reply-To: References: Message-ID: <490F40CB.8040605@FreeBSD.org> Ryan skrev: > Hello, I purchased a new Clevo M860TU on the account that it ran linux > very well and was hoping it would fair the same on FreeBSD. Not so > much, little help? I posted this in mobile originally but though > stable would be a better choice. Don't know if it is more appropriate > here or ACPI. > > I'm giving you as much information as I know how to get. as I cannot > get sysinstall to load I am having to type all these dmesg. The boot > process is hanging. This is all with 7.x, I can give 6.x if needed. > > Hardware: > Intel P9500 > 4gb DDR3-1066 > Nvidia 9800M GT > Atheros AR5006e > > FreeBSD 7.1-BETA2 > > These snippets of dmesg happen around the end where it hangs. > > 1. Default > > ... > cpu0: on acpi0 > ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj > 0xc6a02d40 [20070320] > ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving > operands for [OpcodeName unavailable] [20070320] > ACPI Error (psparse-0626): Method parse/execution failed > [\_PR_.CPU0._OSC] (Node 0xc68556e0), AE_AML_INTERNAL > est0: on cpu0 > p4tcc0: on cpu0 > cpu1: on acpi0 > ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj > 0xc6a0e300 [20070320] > ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving > operands for [OpcodeName unavailable] [20070320] > ACPI Error (psparse-0626): Method parse/execution failed > [\_PR_.CPU1._OSC] (Node 0xc685560), AE_AML_INTERNAL > est1: on cpu1 > p4tcc1: on cpu1 > ... 
> cpu0: Cx states changed > cpu1: Cx states changed > unknown: timeout waiting for read DRQ > unknown: timeout waiting for read DRQ > acd0: DVDR at ata3-master UDMA33 > GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install > run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config > run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config > run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config > run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config > run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config Disabling firewire completely in BIOS might at least get the machine booting. You should try that if you haven't already. I've seen this problem on at least two different systems... -- Joel From jespasac at minibofh.org Mon Nov 3 13:54:30 2008 From: jespasac at minibofh.org (Jordi Espasa Clofent) Date: Mon Nov 3 13:54:37 2008 Subject: Replication system Message-ID: <490F6EBF.5000102@minibofh.org> Hi all, I have to build a clustered website with FreeBSD 7.x as SO and Apache 2.x as httpd. As load-balancing solution I'll use HAProxy (or maybe a OpenBSD relayd, I'm not sure). Because of several technical (and especially non-technical) reasons, I haven't the possibility to mount a shared storage layer (NFS, SAN...) so I have to share the local data among the different httpd servers. At first approach I've thought in rsync+cron, but ?anyone knows another replication-data solution in the described scenario? PD. Please, don't advice to me to using a pure shared-data layer... I know it will be the optimal structure, but as I've said above, I can't use it because various reasons. 
-- Thanks, Jordi Espasa Clofent From m4gicite at gmail.com Mon Nov 3 14:34:31 2008 From: m4gicite at gmail.com (Ryan) Date: Mon Nov 3 14:42:13 2008 Subject: Install issues with 7.x In-Reply-To: <490F40CB.8040605@FreeBSD.org> References: <490F40CB.8040605@FreeBSD.org> Message-ID: Sadly with the quality of BIOS recently, that is not an option. Not much to offer. Attached is a picture of what I have to change. Other and XP are the same, Vista unlocks AHCI. Another way of accomplishing disabling firewire is to remake the install CD with a different kernel and not quite sure how to do that. On Mon, Nov 3, 2008 at 6:19 PM, Joel Dahl wrote: > Ryan skrev: >> >> Hello, I purchased a new Clevo M860TU on the account that it ran linux >> very well and was hoping it would fair the same on FreeBSD. Not so >> much, little help? I posted this in mobile originally but though >> stable would be a better choice. Don't know if it is more appropriate >> here or ACPI. >> >> I'm giving you as much information as I know how to get. as I cannot >> get sysinstall to load I am having to type all these dmesg. The boot >> process is hanging. This is all with 7.x, I can give 6.x if needed. >> >> Hardware: >> Intel P9500 >> 4gb DDR3-1066 >> Nvidia 9800M GT >> Atheros AR5006e >> >> FreeBSD 7.1-BETA2 >> >> These snippets of dmesg happen around the end where it hangs. >> >> 1. Default >> >> ... 
>> cpu0: on acpi0 >> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj >> 0xc6a02d40 [20070320] >> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving >> operands for [OpcodeName unavailable] [20070320] >> ACPI Error (psparse-0626): Method parse/execution failed >> [\_PR_.CPU0._OSC] (Node 0xc68556e0), AE_AML_INTERNAL >> est0: on cpu0 >> p4tcc0: on cpu0 >> cpu1: on acpi0 >> ACPI Error (dsopcode-0350): No pointer back to NS node in buffer obj >> 0xc6a0e300 [20070320] >> ACPI Exception (dswexec-0556): AE_AML_INTERNAL, While resolving >> operands for [OpcodeName unavailable] [20070320] >> ACPI Error (psparse-0626): Method parse/execution failed >> [\_PR_.CPU1._OSC] (Node 0xc685560), AE_AML_INTERNAL >> est1: on cpu1 >> p4tcc1: on cpu1 >> ... >> cpu0: Cx states changed >> cpu1: Cx states changed >> unknown: timeout waiting for read DRQ >> unknown: timeout waiting for read DRQ >> acd0: DVDR at ata3-master UDMA33 >> GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install >> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config >> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config > > Disabling firewire completely in BIOS might at least get the machine > booting. You should try that if you haven't already. I've seen this > problem on at least two different systems... 
> > -- > Joel > From koitsu at FreeBSD.org Mon Nov 3 15:27:04 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Mon Nov 3 15:27:10 2008 Subject: Replication system In-Reply-To: <490F6EBF.5000102@minibofh.org> References: <490F6EBF.5000102@minibofh.org> Message-ID: <20081103232202.GA32276@icarus.home.lan> On Mon, Nov 03, 2008 at 10:35:59PM +0100, Jordi Espasa Clofent wrote: > I have to build a clustered website with FreeBSD 7.x as SO and Apache > 2.x as httpd. As load-balancing solution I'll use HAProxy (or maybe a > OpenBSD relayd, I'm not sure). > > Because of several technical (and especially non-technical) reasons, I > haven't the possibility to mount a shared storage layer (NFS, SAN...) so > I have to share the local data among the different httpd servers. > > At first approach I've thought in rsync+cron, but > > ?anyone knows another replication-data solution in the described scenario? > > PD. Please, don't advice to me to using a pure shared-data layer... I > know it will be the optimal structure, but as I've said above, I can't > use it because various reasons. Try ggatec(8) and ggated(8). They perform replication at the filesystem level, over the network. I do not have experience using them. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From mgrant at grant.org Mon Nov 3 15:39:48 2008 From: mgrant at grant.org (Michael Grant) Date: Mon Nov 3 15:39:55 2008 Subject: Replication system In-Reply-To: <490F6EBF.5000102@minibofh.org> References: <490F6EBF.5000102@minibofh.org> Message-ID: <62b856460811031507n2bbfdc14j3bfa7b6006208ff0@mail.gmail.com> On Mon, Nov 3, 2008 at 10:35 PM, Jordi Espasa Clofent wrote: > Hi all, > > I have to build a clustered website with FreeBSD 7.x as SO and Apache 2.x as > httpd. As load-balancing solution I'll use HAProxy (or maybe a OpenBSD > relayd, I'm not sure). 
> > Because of several technical (and especially non-technical) reasons, I > haven't the possibility to mount a shared storage layer (NFS, SAN...) so I > have to share the local data among the different httpd servers. > > At first approach I've thought in rsync+cron, but > > ?anyone knows another replication-data solution in the described scenario? > > PD. Please, don't advice to me to using a pure shared-data layer... I know > it will be the optimal structure, but as I've said above, I can't use it > because various reasons. > > -- > Thanks, > Jordi Espasa Clofent GlusterFS http://www.gluster.org seems promising. It is a replication layer that sits on top of FUSE (Filesystem in Userspace http://fuse.sf.net). You can replicate pretty much any type of file system, ufs, zfs, dos...etc. In other words, you don't need to reformat your disk or create some special underlying file system. GlusterFS is in userspace. However.... I have yet to get it working on freebsd. Anyone had any luck with GlusterFS on Freebsd 6.x? Michael Grant From ari at ish.com.au Mon Nov 3 16:00:48 2008 From: ari at ish.com.au (Aristedes Maniatis) Date: Mon Nov 3 16:00:56 2008 Subject: Replication system In-Reply-To: <490F6EBF.5000102@minibofh.org> References: <490F6EBF.5000102@minibofh.org> Message-ID: <9335DDBE-F6DC-4433-A47D-D18266BA49CF@ish.com.au> On 04/11/2008, at 8:35 AM, Jordi Espasa Clofent wrote: > At first approach I've thought in rsync+cron, but unison [1] works really well for us. In some ways it is better than some sort of shared SAN type solution since there is no single point of failure at the SAN or link to the SAN. Unison is just two way rsync so that changes can propagate in both directions between servers. 
Ari

[1] http://www.cis.upenn.edu/~bcpierce/unison/

-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001 fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A

From mike at sentex.net Mon Nov 3 17:43:49 2008
From: mike at sentex.net (Mike Tancsa)
Date: Mon Nov 3 17:43:56 2008
Subject: fifo log problem
In-Reply-To: <7.1.0.9.0.20081103113557.167702f0@sentex.net>
References: <43507.1225730046@critter.freebsd.dk> <7.1.0.9.0.20081103113557.167702f0@sentex.net>
Message-ID: <200811040143.mA41hjaa029665@lava.sentex.ca>

At 11:48 AM 11/3/2008, Mike Tancsa wrote:
>At 11:34 AM 11/3/2008, Poul-Henning Kamp wrote:
>>In message <200811031547.mA3FlVVs024666@lava.sentex.ca>, Mike Tancsa writes:
>>
>> >Seems to work fine with cat
>>
>>Ok, and the loss is not from one end, it is random records in
>>the middle ?
>
>
>Yes, they seem to initially get written and then tail off for some
>reason. I am not sure why. Actually, if I SIGHUP syslogd, it seems
>to make a difference, in that I can generally see when newsyslog sig
>HUPs syslog to do log rotation. Perhaps this is confusing things ?

I tried changing the config so that there is only the fifo log being
written to and disabled newsyslog so that syslogd is not getting a HUP
signal. The strange thing is that reading from it gives different
results?!?

Sometimes doing

[ps0278]# fifolog_reader all.fifo | wc
From 0 Wed Dec 31 19:00:00 1969
To 1225760679 Mon Nov 3 20:04:39 2008
Read from 1d800
59 413 3068
0[ps0278]#

and a exactly for 1min it will show the correct results

0[ps0278]# fifolog_reader all.fifo | wc
From 0 Wed Dec 31 19:00:00 1969
To 1225760538 Mon Nov 3 20:02:18 2008
Read from 0
10765 75995 556816
0[ps0278]#

and then go back to showing just a subset for 4 min.
I am guessing this coincides with when the flush runs

This is a nanobsd image, so /var on /dev/md1 and RELENG_7 from a few days ago

I have been running

#!/bin/sh
i=0
while true
do
    i=`expr $i + 1`
    logger $i
    echo $i
    sleep 1
done

and they seem to be there when it shows all the results, but for the
most part it just shows a subset

---Mike

From k at kevinkevin.com Mon Nov 3 20:56:20 2008
From: k at kevinkevin.com (Kevin)
Date: Mon Nov 3 20:56:26 2008
Subject: DL360 G3 w/ AMD64 Cant boot from CD
Message-ID: <000001c93e35$9d456790$d7d036b0$@com>

Hello,

I am trying to install FreeBSD on my HP DL360 G3 server with 8GB of ram. I
need to use the AMD64 distribution, as I understand, to utilize over 4GB of
ram.

Unfortunately I cannot boot from any of the FreeBSD CDs :

7.0-RELEASE-AMD64
6.3-RELEASE-AMD64

I will try 8.0-CURRENT-AMD64, however I don't think it will matter. Since
I'm not using "unusual" hardware, but rather hardware that has been
established at least since 2003, I was hoping that someone else may have
encountered this problem.

If anyone can assist me , it would be greatly appreciated.

The specific error message is as follows :

int=0000000d err=00000000 efl=00010006 eip=000219b2
eax=000219ac ebx=00000000 ecx=c0000080 edx=0006d948
esi=0003e007 edi=00000000 ebp=000940bc esp=0009e088
cs=0008 ds=0010 fs=0010 gs=0010 ss=0010
cs:eip=0f 32 0d 00 01 00 00 0f-30 0f 20 e0 83 c8 30 0f
22 e0 b8 00 c0 03 00 0f-22 d8 0f 20 c0 0d 00 00
Ss:eso=90 95 00 00 00 80 fc 00-00 90 fc 00 07 e0 03 00
00 00 00 00 07 d0 03 00-00 00 00 00 3c d9 06 00
BTX halted

The above happens /JUST/ after pressing "enter" at the boot screen when
booting from the FreeBSD cd.

Thanks,

Kevin K.
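
An illustrative aside, not part of the original post: amd64 media can only
boot on a CPU that implements 64-bit long mode, so that is one thing worth
ruling out separately from the bootstrap code. A minimal sketch, assuming
the box can be brought up under an i386 FreeBSD that saves its boot
messages in /var/run/dmesg.boot; has_long_mode is a made-up helper name,
not an existing tool.

```shell
#!/bin/sh
# Sketch only: has_long_mode is a hypothetical helper, not a FreeBSD tool.
# FreeBSD prints the extended CPU feature flags at boot on a line such as
#   AMD Features=0x20100800<SYSCALL,NX,LM>
# "LM" in that list means the CPU implements 64-bit long mode.

# has_long_mode [FILE]
#   FILE defaults to the saved boot messages.
#   Returns 0 if the LM flag is listed, non-zero otherwise.
has_long_mode() {
    grep -Eq 'AMD Features=[^<]*<([^>]*,)?LM(,[^>]*)?>' "${1:-/var/run/dmesg.boot}"
}

# Example use:
#   if has_long_mode; then
#       echo "LM flag present: amd64 media should be usable"
#   else
#       echo "no LM flag found: this CPU needs i386 media"
#   fi
```

The pattern only looks at the "AMD Features=" line, so a match on a
similarly named flag in "Features=" or "AMD Features2=" is not counted.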
From koitsu at FreeBSD.org Mon Nov 3 20:58:12 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Mon Nov 3 20:58:19 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <000001c93e35$9d456790$d7d036b0$@com> References: <000001c93e35$9d456790$d7d036b0$@com> Message-ID: <20081104045809.GA38528@icarus.home.lan> On Mon, Nov 03, 2008 at 11:27:13PM -0500, Kevin wrote: > Hello, > > I am trying to install FreeBSD on my HP DL360 G3 server with 8GB of ram. I > need to use the AMD64 distribution, as I understand, to utilize over 4GB of > ram. > > Unfortunately I cannot boot from any of the FreeBSD CDs : > > > 7.0-RELEASE-AMD64 > 6.3-RELEASE-AMD64 > > > I will try 8.0-CURRENT-AMD64, however I don't think it will matter. Since > I'm not using "unusual" hardware, but rather hardware that has been > established at least since 2003, I was hoping that someone else may have > encountered this problem. > > If anyone can assist me , it would be greatly appreciated. Can you please try 7.1-PRERELEASE or 6.4-RC2 (just announced today)? There have been bootstrap-related changes since 7.0-RELEASE which may fix your problem. > The specific error message is as follows : > > > int=0000000d err=00000000 efl=00010006 eip=000219b2 > eax=000219ac ebx=00000000 ecx=c0000080 edx=0006d948 > esi=0003e007 edi=00000000 ebp=000940bc esp=0009e088 > cs=0008 ds=0010 fs=0010 gs=0010 ss=0010 > cs:eip=0f 32 0d 00 01 00 00 0f-30 0f 20 e0 83 c8 30 0f > 22 e0 b8 00 c0 03 00 0f-22 d8 0f 20 c0 0d 00 00 > Ss:eso=90 95 00 00 00 80 fc 00-00 90 fc 00 07 e0 03 00 > 00 00 00 00 07 d0 03 00-00 00 00 00 3c d9 06 00 > BTX halted > > The above happens /JUST/ after pressing "enter" at the boot screen when > booting from the FreeBSD cd. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From numardbsd at gmail.com Mon Nov 3 22:06:43 2008 From: numardbsd at gmail.com (Norberto Meijome) Date: Mon Nov 3 22:06:50 2008 Subject: Replication system In-Reply-To: <490F6EBF.5000102@minibofh.org> References: <490F6EBF.5000102@minibofh.org> Message-ID: <20081104164230.013e9887@ayiin> On Mon, 03 Nov 2008 22:35:59 +0100 Jordi Espasa Clofent wrote: > Hi all, Hola Jordi, > I have to build a clustered website with FreeBSD 7.x as SO and Apache > 2.x as httpd. As load-balancing solution I'll use HAProxy (or maybe a > OpenBSD relayd, I'm not sure). you may want to look into carp as well if you don't want to have a separate layer of load balancers. > Because of several technical (and especially non-technical) reasons, I > haven't the possibility to mount a shared storage layer (NFS, SAN...) so > I have to share the local data among the different httpd servers. > > At first approach I've thought in rsync+cron, but > > __anyone knows another replication-data solution in the described scenario? rsync / rsyncd is simple and works. But it really depends on how often you'll be publishing to your site, how big are the change sets ( consider publishing to a separate directory via rsync and then doing an atomic rename/move if the change set is too big.) we used to publish from AU to NL to dir1 in server1 , then ssh to server1 and rsync to server2-n in parallel - all from a script of course. That way, the slow link ( AU <-> NL) never got in the way of the publish. B _________________________ {Beto|Norberto|Numard} Meijome He could be a poster child for retroactive birth control. I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned. 
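
An illustrative aside, not part of the original post: the "publish to a
separate directory, then rename" idea above can be sketched roughly as
follows. This is a minimal sketch under stated assumptions: publish() and
the ".new"/".old" suffixes are hypothetical names, the source is a local
directory, and the staging directory sits on the same filesystem as the
live one. The swap here is two renames rather than one strictly atomic
step; flipping a symlink is a common alternative.

```shell
#!/bin/sh
# Sketch only: publish() and the ".new"/".old" suffixes are hypothetical.

# publish SRC LIVE
#   SRC  - local directory holding the new version of the site
#   LIVE - directory the web server serves from (no trailing slash)
publish() {
    src=$1
    live=$2
    staging="$live.new"
    previous="$live.old"

    # Build the new tree next to the live one, so the final mv is a
    # rename within one filesystem rather than a copy.
    rm -rf "$staging" || return 1
    mkdir -p "$staging" || return 1
    if command -v rsync >/dev/null 2>&1; then
        rsync -a --delete "$src/" "$staging/" || return 1
    else
        # Fallback for a purely local copy when rsync is not installed.
        cp -Rp "$src/." "$staging/" || return 1
    fi

    # Swap. Each mv is a single rename, so clients never see a
    # half-copied tree. There is a very short gap between the two
    # renames during which LIVE does not exist.
    rm -rf "$previous" || return 1
    if [ -d "$live" ]; then
        mv "$live" "$previous" || return 1
    fi
    mv "$staging" "$live"
}

# Example use (paths are made up):
#   publish /home/build/site /usr/local/www/data
```

The previous tree is kept as LIVE.old, which gives a quick way back if the
new content turns out to be broken.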
From bms at incunabulum.net Tue Nov 4 03:12:49 2008 From: bms at incunabulum.net (Bruce M Simpson) Date: Tue Nov 4 03:12:59 2008 Subject: GCC profiling broken with C++ exceptions on FreeBSD 7.1 Message-ID: <49102E2E.5000502@incunabulum.net> Hi, I just noticed this in testing over last week, it seems to be specific to FreeBSD 7.1: http://bugzilla.xorp.org/bugzilla/show_bug.cgi?id=811 Does anyone have any information we could use to further track this down? Other platforms don't seem to be affected. cheers BMS From freebsd at violetlan.net Tue Nov 4 03:15:14 2008 From: freebsd at violetlan.net (Reinhold) Date: Tue Nov 4 03:15:50 2008 Subject: Atheros AR5141 Message-ID: <53554.217.45.165.129.1225797313.squirrel@www.violetlan.net> Hi I got myself a nice little R52 802.11a/b/g MiniPCI Card for our router. This card is using the Atheros AR5141 chipset. I searched everywhere to find out more info on this and came across a few pages where it said that the AR5141 chipset is supported but when I stick the card in the router and boot FreeBSD 7-STABLE it hangs when loading the Atheros drivers It stops right after this ath0: mem 0xdfff0000-0xdfffffff irq 10 at device 11.0 on pci0 ath0: [ITHREAD] Are there a new version of the ath driver somewhere that does support this chipset? Thanks Reinhold From joel at FreeBSD.org Tue Nov 4 03:56:31 2008 From: joel at FreeBSD.org (Joel Dahl) Date: Tue Nov 4 03:56:38 2008 Subject: Install issues with 7.x In-Reply-To: References: <490F40CB.8040605@FreeBSD.org> Message-ID: <49103869.3080709@FreeBSD.org> Ryan skrev: > Sadly with the quality of BIOS recently, that is not an option. Not > much to offer. Attached is a picture of what I have to change. Other > and XP are the same, Vista unlocks AHCI. > > Another way of accomplishing disabling firewire is to remake the > install CD with a different kernel and not quite sure how to do that. Take a look at the release(7) manpage for information about building your own customized release CD. 
-- Joel From kjedruczyk at ramfasto.com Tue Nov 4 04:24:33 2008 From: kjedruczyk at ramfasto.com (=?UTF-8?B?S3J6eXN6dG9mIErEmWRydWN6eWs=?=) Date: Tue Nov 4 04:24:41 2008 Subject: PostgreSQL stats collector eats all CPU time Message-ID: <49103BC0.3070605@ramfasto.com> Recently postgresql on our database server started showing some sort of problems: after running for some time stats collector process eats 100% cpu time - exactly as someone reported here: http://groups.google.com/group/pgsql.general/browse_thread/thread/6dfea591d243e987 No solution is provided there though... kernel/libc bug is suggested I'm not sure how relevant it is - problem appeared first time about a day or two after server has been upgraded with additional processor: now it is 2x dual core opteron with 8GB of RAM. For some reason we didn't see this problem back when it was just one dual core opteron with 4GB of RAM. It is amd64 version of freebsd of course... As the person who reported the problem previously on postgresql mailing list showed - the stats collector busy-loops in interrupted poll call - kdump contains output like this: 878 postgres 0.009643 CALL poll(0x7fffffffd4e0,0x1,0x7d0) 878 postgres 0.009671 RET poll -1 errno 4 Interrupted system call 878 postgres 0.009675 CALL poll(0x7fffffffd4e0,0x1,0x7d0) 878 postgres 0.009687 RET poll -1 errno 4 Interrupted system call 878 postgres 0.009691 CALL poll(0x7fffffffd4e0,0x1,0x7d0) 878 postgres 0.009700 RET poll -1 errno 4 Interrupted system call I also grabbed core dump of the postmaster process and the backtrace seems a little weird to me: #0 0x00000008012186cc in poll () from /lib/libc.so.7 [New Thread 0x801601120 (LWP 100209)] [New LWP 54785] (gdb) bt #0 0x00000008012186cc in poll () from /lib/libc.so.7 #1 0x000000080107c85e in poll () from /lib/libthr.so.3 #2 0x0000000000578bd0 in pgstat_start () #3 0x000000000057d2b5 in PostmasterMain () #4 #5 0x0000000801268cdc in select () from /lib/libc.so.7 #6 0x000000080107c574 in select () from 
/lib/libthr.so.3 #7 0x000000000057aaa3 in ClosePostmasterPorts () #8 0x000000000057be9e in PostmasterMain () #9 0x00000000005358fe in main () If I'm reading it right the constantly interrupted poll function is being called from the signal handler? Any suggestions what else to do to identify the problem? It seems that the situation will be reproducible - after server restart it happened again within one day. -- Best regards, Krzysztof J?druczyk From k at kevinkevin.com Tue Nov 4 04:37:15 2008 From: k at kevinkevin.com (Kevin) Date: Tue Nov 4 04:37:21 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <20081104045809.GA38528@icarus.home.lan> References: <000001c93e35$9d456790$d7d036b0$@com> <20081104045809.GA38528@icarus.home.lan> Message-ID: <000f01c93e7a$0a146210$1e3d2630$@com> > Can you please try 7.1-PRERELEASE or 6.4-RC2 (just announced today)? > There have been bootstrap-related changes since 7.0-RELEASE which may > fix your problem. I tried 7.1-BETA2 , but unfortunately the same problem happened. I tried (for the sake of argument) Debian debian-40r5-amd64 , and it wouldn't boot either -- it said "Your CPU does not support long mode, please use a 32bit distribution". How would I support over 4GB of ram with only i386 distributions? Thank you. From koitsu at FreeBSD.org Tue Nov 4 04:48:55 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Tue Nov 4 04:49:02 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <000f01c93e7a$0a146210$1e3d2630$@com> References: <000001c93e35$9d456790$d7d036b0$@com> <20081104045809.GA38528@icarus.home.lan> <000f01c93e7a$0a146210$1e3d2630$@com> Message-ID: <20081104124853.GA48015@icarus.home.lan> On Tue, Nov 04, 2008 at 07:37:01AM -0500, Kevin wrote: > > Can you please try 7.1-PRERELEASE or 6.4-RC2 (just announced today)? > > There have been bootstrap-related changes since 7.0-RELEASE which may > > fix your problem. > > > I tried 7.1-BETA2 , but unfortunately the same problem happened. 
I tried
> (for the sake of argument) Debian debian-40r5-amd64 , and it wouldn't boot
> either -- it said "Your CPU does not support long mode, please use a 32bit
> distribution".

This means your processor does not support 64-bit mode.

> How would I support over 4GB of ram with only i386 distributions?

There is only one option: use PAE mode, which has drawbacks. You can read about what PAE is on Wikipedia.

Note that not all drivers are PAE mode compatible on FreeBSD. You should be able to install i386 FreeBSD with success, then rebuild your kernel with PAE enabled. Look at /sys/i386/conf/PAE for an example configuration -- you'll see all of the drivers you have to disable for PAE to work successfully. If your system uses any of these drivers, PAE mode will not work for you, in which case you should upgrade your hardware.

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |

From dudu.meyer at gmail.com Tue Nov 4 06:12:52 2008
From: dudu.meyer at gmail.com (Eduardo Meyer)
Date: Tue Nov 4 06:12:58 2008
Subject: Disk top usage PIDs
Message-ID:

Hello,

I have a serious issue. Sometimes something happens and my disk performance hits its limit quickly. I monitor with gstat and iostat -xw1, and everything is usually fine, with %b around 20 and 0 to 1 pending I/O requests. Suddenly I get 30 to 40 pending requests and %b stays at 100% (or more).

fstat and lsof give me no hint, because the types of programs as well as the number of them are just the same.

How can I find the PID which is hammering my disk? Is there an "iotop" or "disktop" tool or something similar?

It's a mail server. I have pop3 and imap, and I also have maildrop and sometimes httpd working around the busiest mount point.

I have also started AUDIT; however, all I can get are the top PIDs which issue read/write requests.
Not the requests which take longer to perform (the busiest ones), or should I look for some special audit class or event other than open, read and write? Thank you in advance. -- =========== Eduardo Meyer pessoal: dudu.meyer@gmail.com profissional: ddm.farmaciap@saude.gov.br From d_elbracht at ecngs.de Tue Nov 4 06:47:20 2008 From: d_elbracht at ecngs.de (d_elbracht) Date: Tue Nov 4 06:47:26 2008 Subject: Block device In-Reply-To: References: Message-ID: <004901c93e8a$1b556500$639049d9@EC1a> Hi list, can someone please explain, why stat -x /dev/da1 show the SCSI-Drive as a character-device ? Also, on the same device S_ISREG(st.st_mode) is false S_ISCHR(st.st_mode) is true thanks Dieter From max at love2party.net Tue Nov 4 06:51:04 2008 From: max at love2party.net (Max Laier) Date: Tue Nov 4 06:51:11 2008 Subject: Block device In-Reply-To: <004901c93e8a$1b556500$639049d9@EC1a> References: <004901c93e8a$1b556500$639049d9@EC1a> Message-ID: <200811041551.00593.max@love2party.net> On Tuesday 04 November 2008 15:32:01 d_elbracht wrote: > Hi list, > > can someone please explain, why > stat -x /dev/da1 > > show the SCSI-Drive as a character-device ? > > Also, on the same device > > S_ISREG(st.st_mode) is false > > S_ISCHR(st.st_mode) is true http://www.freebsd.org/doc/en/books/arch-handbook/driverbasics-block.html -- /"\ Best regards, | mlaier@freebsd.org \ / Max Laier | ICQ #67774661 X http://pf4freebsd.love2party.net/ | mlaier@EFnet / \ ASCII Ribbon Campaign | Against HTML Mail and News From tcanich at geosc.psu.edu Tue Nov 4 06:51:35 2008 From: tcanich at geosc.psu.edu (Tom Canich) Date: Tue Nov 4 06:51:42 2008 Subject: Atheros AR5141 In-Reply-To: <53554.217.45.165.129.1225797313.squirrel@www.violetlan.net> References: <53554.217.45.165.129.1225797313.squirrel@www.violetlan.net> Message-ID: <20081104141532.GA1076@geosc.psu.edu> > Are there a new version of the ath driver somewhere that does support this > chipset? 
Hello Reinhold, The latest ath_hal is available from http://people.freebsd.org/~sam/ . Tom From hausen at punkt.de Tue Nov 4 07:24:50 2008 From: hausen at punkt.de (Patrick M. Hausen) Date: Tue Nov 4 07:24:59 2008 Subject: Block device In-Reply-To: <004901c93e8a$1b556500$639049d9@EC1a> References: <004901c93e8a$1b556500$639049d9@EC1a> Message-ID: <20081104145144.GB14539@hugo10.ka.punkt.de> h, all, On Tue, Nov 04, 2008 at 03:32:01PM +0100, d_elbracht wrote: > Hi list, > > can someone please explain, why > stat -x /dev/da1 > > show the SCSI-Drive as a character-device ? http://www.freebsd.org/doc/en/books/arch-handbook/driverbasics-block.html HTH, Patrick M. Hausen Leiter Netzwerke und Sicherheit -- punkt.de GmbH * Kaiserallee 13a * 76133 Karlsruhe Tel.
0721 9109 0 * Fax 0721 9109 100 info@punkt.de http://www.punkt.de Gf: J?rgen Egeling AG Mannheim 108285 From k at kevinkevin.com Tue Nov 4 08:18:50 2008 From: k at kevinkevin.com (Kevin) Date: Tue Nov 4 08:18:57 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <20081104124853.GA48015@icarus.home.lan> References: <000001c93e35$9d456790$d7d036b0$@com> <20081104045809.GA38528@icarus.home.lan> <000f01c93e7a$0a146210$1e3d2630$@com> <20081104124853.GA48015@icarus.home.lan> Message-ID: <001b01c93e98$fe2e7b60$fa8b7220$@com> > > I tried 7.1-BETA2 , but unfortunately the same problem happened. I > tried > > (for the sake of argument) Debian debian-40r5-amd64 , and it wouldn't > boot > > either -- it said "Your CPU does not support long mode, please use a > 32bit > > distribution". > > This means your processor does not support 64-bit mode. > > > How would I support over 4GB of ram with only i386 distributions? > > There is only one option: use PAE mode, which has drawbacks. You can > read about what PAE is on Wikipedia. > > Note that not all drivers are PAE mode compatible on FreeBSD. You > should be able to install i386 FreeBSD with success, then rebuild your > kernel with PAE enabled. Look at /sys/i386/conf/PAE for an example > configuration -- you'll see all of the drivers you have to disable for > PAE to work successfully. If your system uses any of these drivers, > PAE > mode will not work for you, in which case you should upgrade your > hardware. > > -- > | Jeremy Chadwick jdc at parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. PGP: 4BD6C0CB | Is my issue related to non-supporting of 64bit mode (2x Dual Xeons in the DL360 G3) , or perhaps due to a boot loader bug? I found this PR : 91492 freebsd- amd64 feedback serious medium current-us [boot] BTX halted In any case , the latest 6.4/7.1 Snapshot produced the same BTX Halt error. 
I'll just use i386 w/ PAE for now, I suppose. I haven't tested AMD64 with DL360 G4s or DL360 G5s, but if anyone has tested those generations with FreeBSD AMD64, I'd appreciate hearing whether the problem persists in some form.

Thank you.

From scottl at samsco.org Tue Nov 4 08:33:26 2008
From: scottl at samsco.org (Scott Long)
Date: Tue Nov 4 08:33:37 2008
Subject: Block device
In-Reply-To: <20081104145144.GB14539@hugo10.ka.punkt.de>
References: <004901c93e8a$1b556500$639049d9@EC1a> <20081104145144.GB14539@hugo10.ka.punkt.de>
Message-ID: <49107933.7070907@samsco.org>

Patrick M. Hausen wrote:
> h, all,
>
> On Tue, Nov 04, 2008 at 03:32:01PM +0100, d_elbracht wrote:
>> Hi list,
>>
>> can someone please explain, why
>> stat -x /dev/da1
>>
>> show the SCSI-Drive as a character-device ?
>
> http://www.freebsd.org/doc/en/books/arch-handbook/driverbasics-block.html
>

Wow, that's a confusing and misleading article.

1. Disk access in the driver layer still happens on a block basis. It's true that to the application layer, the device has character dev semantics, meaning that arbitrary numbers of bytes can be accessed randomly without any restrictions. But deep down inside the kernel, it's still doing block-by-block access.

2. Caching still happens at the filesystem level. Doing I/O directly to /dev/daX or adX or whatever will not be cached/buffered, but doing I/O to a file on any of these devices will.

3. Cache coherency between the block and character device representations was indeed an issue, but removing the block/cached representation was really a matter of policy over tools, and it's one reason why FreeBSD gets creamed in the silly-io-benchmark department.

4. However, in the not-so-silly-io-benchmark department, I think FreeBSD does a whole lot better because you don't have the blind caching of the block device trying to out-guess what the filesystem is trying to do.
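The check the original poster ran (S_ISCHR true, S_ISREG false on /dev/da1) can be reproduced from userland with a few lines. The following is a minimal sketch in Python using the standard `stat` module; `device_kind` is a hypothetical helper name, and the default path `/dev/da1` is simply the one mentioned in the question.

```python
import os
import stat
import sys


def device_kind(mode):
    """Classify an st_mode value the way the S_IS* macros do."""
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"


if __name__ == "__main__":
    # Path is taken from the command line; /dev/da1 is only the example
    # from the thread and may not exist on the machine running this.
    path = sys.argv[1] if len(sys.argv) > 1 else "/dev/da1"
    print(path, "is a", device_kind(os.stat(path).st_mode))
```

On a FreeBSD system of this era a disk node would be expected to report "character device", consistent with what `stat -x` showed; the script makes no claim about what other systems report.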
Scott From koitsu at FreeBSD.org Tue Nov 4 08:41:36 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Tue Nov 4 08:41:43 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <001b01c93e98$fe2e7b60$fa8b7220$@com> References: <000001c93e35$9d456790$d7d036b0$@com> <20081104045809.GA38528@icarus.home.lan> <000f01c93e7a$0a146210$1e3d2630$@com> <20081104124853.GA48015@icarus.home.lan> <001b01c93e98$fe2e7b60$fa8b7220$@com> Message-ID: <20081104164131.GA52756@icarus.home.lan> On Tue, Nov 04, 2008 at 11:18:36AM -0500, Kevin wrote: > > > I tried 7.1-BETA2 , but unfortunately the same problem happened. I > > tried > > > (for the sake of argument) Debian debian-40r5-amd64 , and it wouldn't > > boot > > > either -- it said "Your CPU does not support long mode, please use a > > 32bit > > > distribution". > > > > This means your processor does not support 64-bit mode. > > > > > How would I support over 4GB of ram with only i386 distributions? > > > > There is only one option: use PAE mode, which has drawbacks. You can > > read about what PAE is on Wikipedia. > > > > Note that not all drivers are PAE mode compatible on FreeBSD. You > > should be able to install i386 FreeBSD with success, then rebuild your > > kernel with PAE enabled. Look at /sys/i386/conf/PAE for an example > > configuration -- you'll see all of the drivers you have to disable for > > PAE to work successfully. If your system uses any of these drivers, > > PAE > > mode will not work for you, in which case you should upgrade your > > hardware. > > > > -- > > | Jeremy Chadwick jdc at parodius.com | > > | Parodius Networking http://www.parodius.com/ | > > | UNIX Systems Administrator Mountain View, CA, USA | > > | Making life hard for others since 1977. PGP: 4BD6C0CB | > > > Is my issue related to non-supporting of 64bit mode (2x Dual Xeons in the > DL360 G3) , or perhaps due to a boot loader bug? 
> > I found this PR : > > 91492 freebsd- amd64 feedback serious medium current-us [boot] BTX > halted There have been a number of recent reports of "BTX halted" on all sorts of hardware. So far each problem has been unique; there doesn't appear to be a "thing" that explains it for everyone. > In any case , the latest 6.4/7.1 Snapshot produced the same BTX Halt error. > I'll just use i386 w/ PAE for now I suppose. I haven't tested AMD64 with > DL360 G4's or DL360 G5's , but I'd appreciate if anyone has tested those > generations w/ FreeBSD AMD64 , to let me know if the problem persists in > some form. John Baldwin will have to correct me if I'm wrong, but I'm fairly sure the FreeBSD bootstraps operate in pure i386 real mode up until boot2/loader. John, can you confirm? -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From tevans.uk at googlemail.com Tue Nov 4 08:59:55 2008 From: tevans.uk at googlemail.com (Tom Evans) Date: Tue Nov 4 09:00:02 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <001b01c93e98$fe2e7b60$fa8b7220$@com> References: <000001c93e35$9d456790$d7d036b0$@com> <20081104045809.GA38528@icarus.home.lan> <000f01c93e7a$0a146210$1e3d2630$@com> <20081104124853.GA48015@icarus.home.lan> <001b01c93e98$fe2e7b60$fa8b7220$@com> Message-ID: <1225817988.38506.4.camel@localhost> On Tue, 2008-11-04 at 11:18 -0500, Kevin wrote: > > > I tried 7.1-BETA2 , but unfortunately the same problem happened. I > > tried > > > (for the sake of argument) Debian debian-40r5-amd64 , and it wouldn't > > boot > > > either -- it said "Your CPU does not support long mode, please use a > > 32bit > > > distribution". > > > > This means your processor does not support 64-bit mode. > > > > > How would I support over 4GB of ram with only i386 distributions? > > > > There is only one option: use PAE mode, which has drawbacks. 
You can
> > read about what PAE is on Wikipedia.
> >
> > Note that not all drivers are PAE mode compatible on FreeBSD. You
> > should be able to install i386 FreeBSD with success, then rebuild your
> > kernel with PAE enabled. Look at /sys/i386/conf/PAE for an example
> > configuration -- you'll see all of the drivers you have to disable for
> > PAE to work successfully. If your system uses any of these drivers,
> > PAE mode will not work for you, in which case you should upgrade your
> > hardware.
> >
> > --
> > | Jeremy Chadwick jdc at parodius.com |
> > | Parodius Networking http://www.parodius.com/ |
> > | UNIX Systems Administrator Mountain View, CA, USA |
> > | Making life hard for others since 1977. PGP: 4BD6C0CB |
>
>
> Is my issue related to non-supporting of 64bit mode (2x Dual Xeons in the
> DL360 G3) , or perhaps due to a boot loader bug?
>
> I found this PR :
>
> 91492 freebsd- amd64 feedback serious medium current-us [boot] BTX
> halted
>
>
> In any case , the latest 6.4/7.1 Snapshot produced the same BTX Halt error.
> I'll just use i386 w/ PAE for now I suppose. I haven't tested AMD64 with
> DL360 G4's or DL360 G5's , but I'd appreciate if anyone has tested those
> generations w/ FreeBSD AMD64 , to let me know if the problem persists in
> some form.
>
>
> Thank you.

If your CPUs don't support LM (long mode, aka amd64), then booting an amd64 image won't get very far. Judging from the Debian message, your CPUs don't.

If you have installed/booted FreeBSD, you can find out what your CPU supports by looking at the Features mentioned in dmesg:

> $ grep Features /var/run/dmesg.boot
Features=0xbfebfbff
Features2=0xe3fd
AMD Features=0x20100000
AMD Features2=0x1

If your CPU supported amd64, it would be mentioned in AMD Features (as LM).
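The same check can be done on the raw hex value instead of the flag names. A minimal sketch, under the assumption (from the CPUID specification, not from this thread) that the "AMD Features" word is EDX of CPUID leaf 0x80000001 and that long mode is bit 29 of it; `has_long_mode` and `LM_BIT` are hypothetical names.

```python
import re

# Assumption: "AMD Features" is CPUID 0x80000001 EDX, where bit 29 = LM.
LM_BIT = 1 << 29


def has_long_mode(dmesg_text):
    """Return True if the 'AMD Features=0x...' word has the LM bit set."""
    # "AMD Features2=" does not match, because '=' must directly follow
    # the word "Features".
    match = re.search(r"AMD Features=0x([0-9a-fA-F]+)", dmesg_text)
    if match is None:
        # No extended feature word reported at all.
        return False
    return bool(int(match.group(1), 16) & LM_BIT)


if __name__ == "__main__":
    # Standard dmesg location on FreeBSD, as used in the message above.
    with open("/var/run/dmesg.boot") as fh:
        print("long mode:", has_long_mode(fh.read()))
```

Applied to the sample output quoted above, the value 0x20100000 has bit 29 set, so the function reports long mode for that machine.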
Tom From mcdouga9 at egr.msu.edu Tue Nov 4 09:03:47 2008 From: mcdouga9 at egr.msu.edu (Adam McDougall) Date: Tue Nov 4 09:03:54 2008 Subject: Disk top usage PIDs In-Reply-To: References: Message-ID: <49107CA1.5090309@egr.msu.edu> Eduardo Meyer wrote: > Hello, > > I have some serious issue. Sometimes something happens and my disk > usage performance find its limit quickly. I follow with gstat and > iostat -xw1, and everything usually happens just fine, with %b around > 20 and 0 to 1 pending i/o request. Suddely I get 30, 40 pending > requests and %b is always on 100% (or more than this). > > fstat and lsof gives me no hint, because the type of programs as well > as the amount of 'em is just the same. > > How can I find the PID which is hammering my disk? Is there an "iotop" > or "disktop" tool or something alike? > > Its a mail server. I have pop3, imap, I also have maildrop and > sometimes, httpd, working around the busiest mount point. > > I have also started AUDIT, however all I can get are the top PIDs > which issue read/write requests. Not the requests which take longer to > perform (the busiest ones), or should I look for some special audit > class or event other than open, read and write? > > Thank you in advance. > > top -mio From jhb at freebsd.org Tue Nov 4 09:11:13 2008 From: jhb at freebsd.org (John Baldwin) Date: Tue Nov 4 09:11:20 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <20081104164131.GA52756@icarus.home.lan> References: <000001c93e35$9d456790$d7d036b0$@com> <001b01c93e98$fe2e7b60$fa8b7220$@com> <20081104164131.GA52756@icarus.home.lan> Message-ID: <200811041210.04875.jhb@freebsd.org> On Tuesday 04 November 2008 11:41:31 am Jeremy Chadwick wrote: > On Tue, Nov 04, 2008 at 11:18:36AM -0500, Kevin wrote: > > > > I tried 7.1-BETA2 , but unfortunately the same problem happened. 
I > > > tried > > > > (for the sake of argument) Debian debian-40r5-amd64 , and it wouldn't > > > boot > > > > either -- it said "Your CPU does not support long mode, please use a > > > 32bit > > > > distribution". > > > > > > This means your processor does not support 64-bit mode. > > > > > > > How would I support over 4GB of ram with only i386 distributions? > > > > > > There is only one option: use PAE mode, which has drawbacks. You can > > > read about what PAE is on Wikipedia. > > > > > > Note that not all drivers are PAE mode compatible on FreeBSD. You > > > should be able to install i386 FreeBSD with success, then rebuild your > > > kernel with PAE enabled. Look at /sys/i386/conf/PAE for an example > > > configuration -- you'll see all of the drivers you have to disable for > > > PAE to work successfully. If your system uses any of these drivers, > > > PAE > > > mode will not work for you, in which case you should upgrade your > > > hardware. > > > > > > -- > > > | Jeremy Chadwick jdc at parodius.com | > > > | Parodius Networking http://www.parodius.com/ | > > > | UNIX Systems Administrator Mountain View, CA, USA | > > > | Making life hard for others since 1977. PGP: 4BD6C0CB | > > > > > > Is my issue related to non-supporting of 64bit mode (2x Dual Xeons in the > > DL360 G3) , or perhaps due to a boot loader bug? > > > > I found this PR : > > > > 91492 freebsd- amd64 feedback serious medium current-us [boot] BTX > > halted > > There have been a number of recent reports of "BTX halted" on all sorts > of hardware. So far each problem has been unique; there doesn't appear > to be a "thing" that explains it for everyone. > > > In any case , the latest 6.4/7.1 Snapshot produced the same BTX Halt error. > > I'll just use i386 w/ PAE for now I suppose. I haven't tested AMD64 with > > DL360 G4's or DL360 G5's , but I'd appreciate if anyone has tested those > > generations w/ FreeBSD AMD64 , to let me know if the problem persists in > > some form. 
> > John Baldwin will have to correct me if I'm wrong, but I'm fairly sure > the FreeBSD bootstraps operate in pure i386 real mode up until > boot2/loader. John, can you confirm? Yes, the BTX fault you are getting is from trying to start 64-bit mode on a CPU that doesn't support 32-bit mode. The latest snapshots should have a fix where you get a more helpful "Your CPU doesn't do 64-bit" message. -- John Baldwin From k at kevinkevin.com Tue Nov 4 09:25:49 2008 From: k at kevinkevin.com (Kevin) Date: Tue Nov 4 09:25:57 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <200811041210.04875.jhb@freebsd.org> References: <000001c93e35$9d456790$d7d036b0$@com> <001b01c93e98$fe2e7b60$fa8b7220$@com> <20081104164131.GA52756@icarus.home.lan> <200811041210.04875.jhb@freebsd.org> Message-ID: <005f01c93ea2$5392e650$fab8b2f0$@com> > Yes, the BTX fault you are getting is from trying to start 64-bit mode > on a > CPU that doesn't support 32-bit mode. The latest snapshots should have > a > fix where you get a more helpful "Your CPU doesn't do 64-bit" message. > > -- > John Baldwin Thank you all for your prudent help, it is much appreciated. From wb at freebie.xs4all.nl Tue Nov 4 11:24:43 2008 From: wb at freebie.xs4all.nl (Wilko Bulte) Date: Tue Nov 4 11:24:51 2008 Subject: DL360 G3 w/ AMD64 Cant boot from CD In-Reply-To: <200811041210.04875.jhb@freebsd.org> References: <000001c93e35$9d456790$d7d036b0$@com> <001b01c93e98$fe2e7b60$fa8b7220$@com> <20081104164131.GA52756@icarus.home.lan> <200811041210.04875.jhb@freebsd.org> Message-ID: <20081104191107.GA1269@freebie.xs4all.nl> Quoting John Baldwin, who wrote on Tue, Nov 04, 2008 at 12:10:04PM -0500 .. > On Tuesday 04 November 2008 11:41:31 am Jeremy Chadwick wrote: > > On Tue, Nov 04, 2008 at 11:18:36AM -0500, Kevin wrote: > > > > > I tried 7.1-BETA2 , but unfortunately the same problem happened. 
I > > > > tried > > > > > (for the sake of argument) Debian debian-40r5-amd64 , and it wouldn't > > > > boot > > > > > either -- it said "Your CPU does not support long mode, please use a > > > > 32bit > > > > > distribution". > > > > > > > > This means your processor does not support 64-bit mode. > > > > > > > > > How would I support over 4GB of ram with only i386 distributions? > > > > > > > > There is only one option: use PAE mode, which has drawbacks. You can > > > > read about what PAE is on Wikipedia. > > > > > > > > Note that not all drivers are PAE mode compatible on FreeBSD. You > > > > should be able to install i386 FreeBSD with success, then rebuild your > > > > kernel with PAE enabled. Look at /sys/i386/conf/PAE for an example > > > > configuration -- you'll see all of the drivers you have to disable for > > > > PAE to work successfully. If your system uses any of these drivers, > > > > PAE > > > > mode will not work for you, in which case you should upgrade your > > > > hardware. > > > > > > > > -- > > > > | Jeremy Chadwick jdc at parodius.com | > > > > | Parodius Networking http://www.parodius.com/ | > > > > | UNIX Systems Administrator Mountain View, CA, USA | > > > > | Making life hard for others since 1977. PGP: 4BD6C0CB | > > > > > > > > > Is my issue related to non-supporting of 64bit mode (2x Dual Xeons in the > > > DL360 G3) , or perhaps due to a boot loader bug? > > > > > > I found this PR : > > > > > > 91492 freebsd- amd64 feedback serious medium current-us [boot] BTX > > > halted > > > > There have been a number of recent reports of "BTX halted" on all sorts > > of hardware. So far each problem has been unique; there doesn't appear > > to be a "thing" that explains it for everyone. > > > > > In any case , the latest 6.4/7.1 Snapshot produced the same BTX Halt > error. > > > I'll just use i386 w/ PAE for now I suppose. 
I haven't tested AMD64 with > > > DL360 G4's or DL360 G5's , but I'd appreciate if anyone has tested those > > > generations w/ FreeBSD AMD64 , to let me know if the problem persists in > > > some form. > > > > John Baldwin will have to correct me if I'm wrong, but I'm fairly sure > > the FreeBSD bootstraps operate in pure i386 real mode up until > > boot2/loader. John, can you confirm? > > Yes, the BTX fault you are getting is from trying to start 64-bit mode on a > CPU that doesn't support 32-bit mode. The latest snapshots should have a does not support 64-bit mode I guess? > fix where you get a more helpful "Your CPU doesn't do 64-bit" message. > > -- > John Baldwin > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" --- End of quoted text --- From ronald-freebsd8 at klop.yi.org Tue Nov 4 16:04:43 2008 From: ronald-freebsd8 at klop.yi.org (Ronald Klop) Date: Tue Nov 4 16:04:55 2008 Subject: Block device In-Reply-To: <49107933.7070907@samsco.org> References: <004901c93e8a$1b556500$639049d9@EC1a> <20081104145144.GB14539@hugo10.ka.punkt.de> <49107933.7070907@samsco.org> Message-ID: On Tue, 04 Nov 2008 17:32:51 +0100, Scott Long wrote: > Patrick M. Hausen wrote: >> h, all, >> On Tue, Nov 04, 2008 at 03:32:01PM +0100, d_elbracht wrote: >>> Hi list, >>> >>> can someone please explain, why stat -x /dev/da1 >>> >>> show the SCSI-Drive as a character-device ? >> >> http://www.freebsd.org/doc/en/books/arch-handbook/driverbasics-block.html >> > > Wow that's a confusing and misleading article. > > 1. disk access in the driver layer still happens on a block basis. It's > true that to the application layer, the device has character dev > semantics, meaning that arbitrary numbers of bytes can be accessed > randomly without any restrictions. 
But deep down inside the kernel, > it's still doing block-by-block access. > > 2. caching still happens at the filesystem level. Doing I/O directly to > /dev/daX or adX or whatever will not be cached/buffered, but doing I/O > to a file on any of these devices will. > > 3. Cache coherency between the block and character device > representations was indeed an issue, but removing the block/cached > representation was really a matter of policy over tools, and it's > one reason why FreeBSD gets creamed in the silly-io-benchmarch > department. > > 4. However, in the not-so-silly-io-benchmark department, I think FreeBSD > does a whole lot better because you don't have the blind caching of the > block device trying to out-guess what the filesystem is trying to do. > > Scott This explains some things to me as a simple user reading the linked article. I'm not a kernel programmer, but do understand computers and this article made me wonder if I missed something last years about disks and caches. Ronald. From bms at incunabulum.net Tue Nov 4 18:16:08 2008 From: bms at incunabulum.net (Bruce M Simpson) Date: Tue Nov 4 18:16:15 2008 Subject: [Fwd: Profiling on FreeBSD] Message-ID: <491101E6.9000903@incunabulum.net> I got this response from Robert, the root cause sounds plausible (amd64 vs i386 not preserving ecx). Any chance of an MFC? I see one is pending in the PR. If I can test and verify the change I could MFC. -------- Original Message -------- Subject: Profiling on FreeBSD Date: Wed, 5 Nov 2008 08:47:37 +1100 From: Robert Jenssen To: bms@incunabulum.net Hi, GCC 4 gprof profiling for i386 has been broken on FreeBSD for quite a while. See PR bin/119709. It appears that the fix has been MFC'd recently: $FreeBSD: src/sys/i386/include/profile.h,v 1.42.2.1 2008/10/13 12:45:18 kib Exp $ (I've been successfully using gprof by patching using the code in the PR but haven't rebuilt my system with the above change) Is there a similar problem for AMD64? 
Cheers, Rob Jenssen From avg at icyb.net.ua Wed Nov 5 01:29:03 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 5 01:29:10 2008 Subject: ukbd in kernel and in loader.conf: lockup? Message-ID: <49115371.1080701@icyb.net.ua> I observe the symptoms that are the same as described here only with 7-BETA2 (amd64): http://lists.freebsd.org/pipermail/freebsd-hackers/2006-March/015736.html If ukbd is both in kernel and is loaded via module then kernel boot locks up. I have a custom kernel without USB in it (everything is loaded through modules), but sometimes I need to boot GENERIC. So the issue occurs with GENERIC. Unfortunately, loader.conf is not per kernel (and I haven't heard about kernel conditionals in it). Does anybody have any insights on this? In the referred thread John Baldwin mentioned some "edge cases", but nothing concrete. I think that such behavior is not satisfactory. I am willing to help with researching this, but I need some starting pointers. Unfortunately, no serial console. P.S. This is legacy-free machine, no PS/2 port (but I am not sure about actual keyboard controller in hardware). BIOS has "USB Legacy" enabled. GENERIC, obviously, has atkbd and kbdmux in it. -- Andriy Gapon From avg at icyb.net.ua Wed Nov 5 07:37:40 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 5 07:37:46 2008 Subject: ukbd attachment and root mount Message-ID: <4911BA93.9030006@icyb.net.ua> System is FreeBSD 7.1-BETA2 amd64. Looking through my dmesg I see that relative order of ukbd attachment and root mounting is not deterministic. Sometime keyboard is attached first, sometimes root filesystem is mounted first. Quite more often root is mounted first, though. Example (with GENERIC kernel): Nov 3 15:40:54 kernel: Trying to mount root from ufs:/dev/mirror/bootgm Nov 3 15:40:54 kernel: GEOM_LABEL: Label ufs/bootfs removed. Nov 3 15:40:54 kernel: GEOM_LABEL: Label for provider mirror/bootgm is ufs/bootfs. 
Nov 3 15:40:54 kernel: GEOM_LABEL: Label ufs/bootfs removed. Nov 3 15:40:54 kernel: ukbd0: on uhub2 Nov 3 15:40:54 kernel: kbd2 at ukbd0 Nov 3 15:40:54 kernel: uhid0: on uhub2 Another (with custom kernel, zfs root): Nov 4 17:54:03 odyssey kernel: Trying to mount root from zfs:tank/root Nov 4 17:54:03 odyssey kernel: ukbd0: on uhub2 Nov 4 17:54:03 odyssey kernel: kbd2 at ukbd0 Nov 4 17:54:03 odyssey kernel: kbd2: ukbd0, generic (0), config:0x0, flags:0x3d0000 Nov 4 17:54:03 odyssey kernel: uhid0: on uhub2 I have a legacy-free system (no PS/2 ports, only USB) and I wanted to try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was bitten hard when I made a mistake and kernel could not find/mount root filesystem. So I stuck at mountroot prompt without a keyboard to enter anything. This was repeatable about 10 times after which I resorted to live cd. Since then I put back atkbdc into my kernel. I guess BIOS or USB hardware emulate AT or PS/2 keyboard, so the USB keyboard works before the driver attaches. I guess I need such emulation e.g. for loader or boot0 configuration. But I guess I don't have to have atkbd driver in kernel. I wonder why behavior is non-deterministic, why ukbd is "99%" attached after mount root, and what can be done about this. -- Andriy Gapon From david at catwhisker.org Wed Nov 5 07:56:24 2008 From: david at catwhisker.org (David Wolfskill) Date: Wed Nov 5 07:56:32 2008 Subject: Using r/o root with amd(8)-mounted file systems Message-ID: <20081105153114.GA37748@bunrab.catwhisker.org> In networks that I control and which are "sufficiently small" while having enough resources to make it practical -- such as at home -- I like to do a few things to split up the workload and make the "common case" (of merely quietly doing their jobs) easier for the bulk of the machines ... at the expense of needing to tweak things a bit initially to get there, and needing to do a bit more work for upgrades. 
For example, one of the things I like to do is set up "production" machines (e.g., my firewall box and the central mail server) so they: * Each have 2 separate bootable slices, each of which contains a fully-functional root on the "a" partition and /usr on the "d" partition, and a 3rd slice to contain "everything else" (that is used regardless of which slice is the current boot slice): swap space, /var, and a file system that contains the directories where the /home and /usr/local symlinks point. (Yes, I make /usr/local a symlink.) Because I can easily control which slice is the default boot slice via boot0cfg(8), I use the FreeBSD boot loader. * Use NIS for "installation-wide" notions of users & groups. (Hey; one of the machines at home is a SPARCstation 5/170, after all.) * Use NFS for making certain file systems & directories have an "appearance" on the local machine. (Home directories & a few others are presently served by the above-cited SS5/170, though I've started lobbying the "family CFO" to free up funds to migrate that job to a ReadyNAS. /usr/{obj,ports,src} are hosted on the build machine.) * Avoid "hard" NFS mounts. I use amd(8) to manage the NFS mounts, and it's been working well for me for around a decade or so. * Do not have their own /usr/src, /usr/obj, or /usr/ports directory hierarchies. Rather, these are NFS-mounted from a dedicated "build machine" that has no role in the usual day-to-day "production" activities. the build machine has a local private mirror of the FreeBSD CVS repository which I update in 2 stages overnight (via cron(8), of course), and I track branches of interest on it, usually daily, as well as update ports on it daily. At present, I'm tracking RELENG_6, RELENG_7, and HEAD. Thus, the build machine, in addition to building the "world" (userland) and its own kernel, also builds kernels for the other machines. * Mount /usr read-only. 
Yes, this becomes a slight nuisance when it's time to upgrade, but that nearly vanishes inside a few csh(1) aliases. It's slightly more annoying when it's time to upgrade ports on production machines, but I still find it useful: it provides a degree of assurance that things aren't likely changing without my knowledge. And should there be a reboot, that's one more file system that need not be checked. (And there have been cases where the UPS batteries haven't lasted as long as an electrical supply outage.) The above all have been working well for me -- as long as I've had a working build machine, anyway. I had tried mounting the root file system read-only (back in 3.x days); while it mostly worked, sshd(8) threw a bit of a hissy-fit because it couldn't chown(1) a pty entry in /dev. And since my normal mode of operation is to access everything from my laptop (running FreeBSD, of course) via ssh(1), I wasn't too keen on risking running afoul of sshd(8). :-} Now that /dev is merely a figment of the kernel's imagination :-}, I thought I'd re-try mounting root as read-only. As expected, sshd(8) didn't complain -- at least, not about ownership of a pty. What I did encounter -- at least sometimes -- is that if I specify that / is read-only in /etc/fstab, on reboot: * sometimes everything works nicely. * other times, the interaction between the read-only root and amd(8) is such that amd(8) is started, but doesn't actually work. In such cases, a workaround is to mount root read-write, restart amd(8), then mount root read-only. I'm a bit bothered by the nuisance of the latter, but even more concerned about the apparent lack of determinism in the process. Any ideas on how to track this down?
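For reference, the workaround described above amounts to roughly the following. This is only a sketch: the function name and the RUN variable are made up for illustration (set RUN=echo to print the commands instead of running them), and it assumes amd(8) is enabled in rc.conf so that the stock /etc/rc.d/amd script will restart it.

```shell
#!/bin/sh
# Sketch of the manual workaround: remount / read-write, restart amd(8),
# then put / back to read-only. Nothing runs until the function is called.
# RUN is a hypothetical dry-run hook; RUN=echo prints instead of executing.
RUN=${RUN:-}

amd_ro_root_workaround() {
    $RUN mount -u -o rw / &&
    $RUN /etc/rc.d/amd restart &&
    $RUN mount -u -o ro /
}
```

Run as root, calling amd_ro_root_workaround performs the three steps in order and stops at the first one that fails.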
The most recent occurrence was on a machine I'm in the process of setting up to replace our internal mail server: albert(7.1-P)[1] uname -a FreeBSD albert.catwhisker.org 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #1: Wed Nov 5 05:31:00 PST 2008 root@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/ALBERT i386 albert(7.1-P)[2] I rebooted it about 5 times in succession with amd(8) failing to do its job; on the first & the last of these, I performed the above-cited "workaround", after which a reboot came up "normally": albert(7.1-P)[2] mount /dev/ad0s2a on / (ufs, local, read-only, soft-updates) devfs on /dev (devfs, local) /dev/ad0s2d on /usr (ufs, NFS exported, local, read-only) /dev/ad0s3d on /var (ufs, local, soft-updates) /dev/ad0s3e on /bkp (ufs, local, soft-updates) /dev/ad1s1d on /common (ufs, local, soft-updates) /dev/md0 on /tmp (ufs, asynchronous, local) pid660@albert:/host on /host (nfs) pid660@albert:/net on /net (nfs) pogo:/cdrom on /.amd_mnt/pogo/host/cdrom (nfs, nosuid) pogo:/export on /.amd_mnt/pogo/host/export (nfs, nosuid) pogo:/export/bd1 on /.amd_mnt/pogo/host/export/bd1 (nfs, nosuid) pogo:/export/bd2 on /.amd_mnt/pogo/host/export/bd2 (nfs, nosuid) pogo:/export/home on /.amd_mnt/pogo/host/export/home (nfs, nosuid) pogo:/export/local on /.amd_mnt/pogo/host/export/local (nfs, nosuid) albert(7.1-P)[3] uptime 7:29AM up 17 mins, 1 user, load averages: 0.00, 0.00, 0.01 albert(7.1-P)[4] Deploying the machine in production is neither urgent nor critical at this point, so I have some time to work on it. 
Here's where rcorder(8) has to say: albert(7.1-P)[3] rcorder /etc/rc.d/* /usr/local/etc/rc.d/* /etc/rc.d/dumpon /etc/rc.d/ddb /etc/rc.d/initrandom /etc/rc.d/geli /etc/rc.d/gbde /etc/rc.d/encswap /etc/rc.d/ccd /etc/rc.d/swap1 /etc/rc.d/early.sh /etc/rc.d/fsck /etc/rc.d/root /etc/rc.d/hostid /etc/rc.d/mdconfig /etc/rc.d/mountcritlocal /etc/rc.d/zfs /etc/rc.d/FILESYSTEMS /etc/rc.d/var /etc/rc.d/cleanvar /etc/rc.d/random /etc/rc.d/adjkerntz /etc/rc.d/atm1 /etc/rc.d/hostname /etc/rc.d/ipfilter /etc/rc.d/ipnat /etc/rc.d/ipfs /etc/rc.d/kldxref /etc/rc.d/sppp /etc/rc.d/addswap /etc/rc.d/auto_linklocal /etc/rc.d/sysctl /etc/rc.d/serial /etc/rc.d/netif /etc/rc.d/ip6addrctl /etc/rc.d/atm2 /etc/rc.d/pfsync /etc/rc.d/pflog /etc/rc.d/pf /etc/rc.d/isdnd /etc/rc.d/ppp /etc/rc.d/routing /etc/rc.d/ip6fw /etc/rc.d/network_ipv6 /etc/rc.d/devd /etc/rc.d/ipsec /etc/rc.d/ipfw /etc/rc.d/nsswitch /etc/rc.d/resolv /etc/rc.d/mroute6d /etc/rc.d/route6d /etc/rc.d/mrouted /etc/rc.d/routed /etc/rc.d/netoptions /etc/rc.d/NETWORKING /etc/rc.d/mountcritremote /etc/rc.d/ldconfig /etc/rc.d/tmp /etc/rc.d/cleartmp /usr/local/etc/rc.d/xfs /usr/local/etc/rc.d/xdm.sh.noauto /usr/local/etc/rc.d/rplayd.sh.sample /etc/rc.d/accounting /etc/rc.d/devfs /etc/rc.d/ipmon /etc/rc.d/mdconfig2 /etc/rc.d/newsyslog /etc/rc.d/syslogd /etc/rc.d/savecore /etc/rc.d/archdep /etc/rc.d/abi /etc/rc.d/SERVERS /etc/rc.d/named /etc/rc.d/ntpdate /etc/rc.d/rpcbind /etc/rc.d/nfsclient /etc/rc.d/nisdomain /etc/rc.d/ypserv /etc/rc.d/ypbind /etc/rc.d/amd /etc/rc.d/atm3 /etc/rc.d/auditd /etc/rc.d/dmesg /etc/rc.d/ipxrouted /etc/rc.d/kerberos /etc/rc.d/kadmind /etc/rc.d/keyserv /etc/rc.d/kpasswdd /etc/rc.d/quota /etc/rc.d/nfsserver /etc/rc.d/mountd /etc/rc.d/nfsd /etc/rc.d/statd /etc/rc.d/lockd /etc/rc.d/pppoed /etc/rc.d/pwcheck /etc/rc.d/virecover /etc/rc.d/DAEMON /etc/rc.d/apm /etc/rc.d/apmd /etc/rc.d/bootparams /etc/rc.d/hcsecd /etc/rc.d/bthidd /etc/rc.d/local /etc/rc.d/lpd /etc/rc.d/motd /etc/rc.d/mountlate /etc/rc.d/nscd 
/etc/rc.d/ntpd /etc/rc.d/powerd /etc/rc.d/rarpd /etc/rc.d/sdpd /etc/rc.d/rfcomm_pppd_server /etc/rc.d/rtadvd /etc/rc.d/rwho /etc/rc.d/timed /etc/rc.d/ugidfw /etc/rc.d/yppasswdd /etc/rc.d/LOGIN /usr/local/etc/rc.d/mysql-server /usr/local/etc/rc.d/htcacheclean /usr/local/etc/rc.d/dbus rcorder: requirement `usbd' in file `/usr/local/etc/rc.d/hald' has no providers. /usr/local/etc/rc.d/hald /usr/local/etc/rc.d/ffserver /usr/local/etc/rc.d/apache22 /etc/rc.d/ypxfrd /etc/rc.d/ypupdated /etc/rc.d/ypset /etc/rc.d/wpa_supplicant /etc/rc.d/watchdogd /etc/rc.d/syscons /etc/rc.d/sshd /etc/rc.d/sendmail /etc/rc.d/cron /etc/rc.d/jail /etc/rc.d/localpkg /etc/rc.d/securelevel /etc/rc.d/power_profile /etc/rc.d/othermta /etc/rc.d/natd /etc/rc.d/msgs /etc/rc.d/moused /etc/rc.d/mixer /etc/rc.d/inetd /etc/rc.d/idmapd /etc/rc.d/hostapd /etc/rc.d/geli2 /etc/rc.d/ftpd /etc/rc.d/ftp-proxy /etc/rc.d/dhclient /etc/rc.d/bsnmpd /etc/rc.d/bridge /etc/rc.d/bluetooth /etc/rc.d/bgfsck albert(7.1-P)[4] Peace, david -- David H. Wolfskill david@catwhisker.org Depriving a girl or boy of an opportunity for education is evil. See http://www.catwhisker.org/~david/publickey.gpg for my public key. From koitsu at FreeBSD.org Wed Nov 5 08:02:08 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Wed Nov 5 08:02:15 2008 Subject: ukbd attachment and root mount In-Reply-To: <4911BA93.9030006@icyb.net.ua> References: <4911BA93.9030006@icyb.net.ua> Message-ID: <20081105160159.GA14488@icarus.home.lan> On Wed, Nov 05, 2008 at 05:24:03PM +0200, Andriy Gapon wrote: > System is FreeBSD 7.1-BETA2 amd64. > > Looking through my dmesg I see that relative order of ukbd attachment > and root mounting is not deterministic.
Sometime keyboard is attached > first, sometimes root filesystem is mounted first. Quite more often root > is mounted first, though. > Example (with GENERIC kernel): > Nov 3 15:40:54 kernel: Trying to mount root from ufs:/dev/mirror/bootgm > Nov 3 15:40:54 kernel: GEOM_LABEL: Label ufs/bootfs removed. > Nov 3 15:40:54 kernel: GEOM_LABEL: Label for provider mirror/bootgm is > ufs/bootfs. > Nov 3 15:40:54 kernel: GEOM_LABEL: Label ufs/bootfs removed. > Nov 3 15:40:54 kernel: ukbd0: 1.10/1.10, addr 3> on uhub2 > Nov 3 15:40:54 kernel: kbd2 at ukbd0 > Nov 3 15:40:54 kernel: uhid0: 1.10/1.10, addr 3> on uhub2 > > Another (with custom kernel, zfs root): > Nov 4 17:54:03 odyssey kernel: Trying to mount root from zfs:tank/root > Nov 4 17:54:03 odyssey kernel: ukbd0: rev 1.10/1.10, addr 3> on uhub2 > Nov 4 17:54:03 odyssey kernel: kbd2 at ukbd0 > Nov 4 17:54:03 odyssey kernel: kbd2: ukbd0, generic (0), config:0x0, > flags:0x3d0000 > Nov 4 17:54:03 odyssey kernel: uhid0: rev 1.10/1.10, addr 3> on uhub2 I'm not understanding why the "order" matters per se, but I believe you might explain how it impacts you below. > I have a legacy-free system (no PS/2 ports, only USB) and I wanted to > try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was > bitten hard when I made a mistake and kernel could not find/mount root > filesystem. > > So I stuck at mountroot prompt without a keyboard to enter anything. > This was repeatable about 10 times after which I resorted to live cd. I've seen this problem myself many times. In fact, on my FreeBSD box at home, I ran into this last night. That system uses a USB keyboard, but has PS/2 ports if need be. This motherboard has PS/2 emulation for USB devices (often called "USB Legacy" in BIOSes), and that option was enabled; this, of course, allows me to use the USB keyboard in MS-DOS, boot0, and boot2/loader. I also had kbdmux(4) disabled via hint.kbdmux.0.disable="1" in loader.conf, but left atkbd/atkbdc in my kernel. 
(The reason I disabled kbdmux was because of the known 2-or-3-second-delay problem when switching virtual consoles) I encountered a situation last night where I needed to specify the root filesystem at the mountroot prompt, but typing didn't work. I was forced to plug in a PS/2 keyboard. I could reproduce this problem every time. I found that by re-enabling kbdmux (removing the "hint" entry), and instead disabling atkbd/atkbdc via a "hint" entry, solved this problem. So, let's recap the scenario, because people might be confused by what I'm describing: Configuration A --------------- * USB Legacy enabled in BIOS * Kernel config -- includes atkbd, atkbdc, kbdmux, and USB * loader.conf -- hint.kbdmux.0.disable="1" * Able to type in MS-DOS, boot0, boot2/loader * Able to type in multi-user mode * No delays when switching virtual consoles in multi-user mode * Unable to type at mountroot prompt Configuration B --------------- * USB Legacy enabled in BIOS * Kernel config -- includes atkbd, atkbdc, kbdmux, and USB * loader.conf -- hint.atkbd.0.disable="1" * loader.conf -- hint.atkbdc.0.disable="1" * Able to type in MS-DOS, boot0, boot2/loader * Able to type in multi-user mode * No delays when switching virtual consoles in multi-user mode * Able to type at mountroot prompt Draw your own conclusions. (NOTE: I have no idea if hint.atkbdc.0.disabled="1" actually does anything, but I assume hint.atkbd.0.disabled="1" does in fact do what I expect) > Since then I put back atkbdc into my kernel. I guess BIOS or USB > hardware emulate AT or PS/2 keyboard, so the USB keyboard works before > the driver attaches. Correct; "USB Legacy" will do this. "USB Legacy" will work properly up until the point the kernel loads (but BEFORE the USB stack loads). Once the USB stack loads, ukbd or kbdmux takes over. > I guess I need such emulation e.g. for loader or boot0 configuration. > But I guess I don't have to have atkbd driver in kernel. 
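For reference, Configuration B above could be applied with something like the following. This is a sketch under assumptions: the helper name is made up for illustration, and it uses the "disabled" spelling from the NOTE above (the form documented in kbdmux(4) for device hints) rather than the "disable" spelling in the lists.

```shell
#!/bin/sh
# Hypothetical helper: append the two Configuration B hints to a
# loader.conf-style file (defaults to /boot/loader.conf).
# Assumption: the hint suffix is "disabled", as in the NOTE above.
add_kbd_hints() {
    conf=${1:-/boot/loader.conf}
    cat >> "$conf" <<'EOF'
hint.atkbd.0.disabled="1"
hint.atkbdc.0.disabled="1"
EOF
}
```

Passing a scratch file as the argument allows checking the result before touching the real /boot/loader.conf; the hints take effect on the next boot.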
I would recommend not messing with the kernel at all for this. You can disable atkbd and kbdmux via loader.conf "hint" lines. The same goes for psm, AFAIK. > I wonder why behavior is non-deterministic, why ukbd is "99%" attached > after mount root, and what can be done about this. I'm not sure what to say here. But I will say that this situation is quite common, and makes doing *any* sort of administration on FreeBSD difficult -- look at my above situation. Had my motherboard lacked PS/2 ports, I would've been forced to hard reset the system. This is simply unacceptable. I don't know how to solve the problem, I don't know how to contribute code or fixes to help, and I don't know what users can do about it. The only thing I've found that "helps" is what I provided above. I want to help more, but I can't due to lack of knowledge -- and most users will be in this boat. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From freebsd at violetlan.net Wed Nov 5 10:09:44 2008 From: freebsd at violetlan.net (Reinhold) Date: Wed Nov 5 10:09:52 2008 Subject: Atheros AR5141 In-Reply-To: <20081104141532.GA1076@geosc.psu.edu> References: <53554.217.45.165.129.1225797313.squirrel@www.violetlan.net> <20081104141532.GA1076@geosc.psu.edu> Message-ID: <64819.217.45.165.129.1225908582.squirrel@www.violetlan.net> Hi Thanks for showing me that page, I've downloaded the latest drivers and had a look at the devid.h file and pretty much all the other files as well but I can't find any reference to the AR5414 chipset. The closest I've seen was AR5413 and AR5416 but not AR5414. Is there someone that is using this chipset? Thanks Reinhold > > The latest ath_hal is available from http://people.freebsd.org/~sam/ . 
> > > Tom From avg at icyb.net.ua Wed Nov 5 10:39:17 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 5 10:39:25 2008 Subject: ukbd attachment and root mount In-Reply-To: <20081105160159.GA14488@icarus.home.lan> References: <4911BA93.9030006@icyb.net.ua> <20081105160159.GA14488@icarus.home.lan> Message-ID: <4911E852.8060100@icyb.net.ua> on 05/11/2008 18:01 Jeremy Chadwick said the following: > On Wed, Nov 05, 2008 at 05:24:03PM +0200, Andriy Gapon wrote: [snip] >> Another (with custom kernel, zfs root): >> Nov 4 17:54:03 odyssey kernel: Trying to mount root from zfs:tank/root >> Nov 4 17:54:03 odyssey kernel: ukbd0: > rev 1.10/1.10, addr 3> on uhub2 >> Nov 4 17:54:03 odyssey kernel: kbd2 at ukbd0 >> Nov 4 17:54:03 odyssey kernel: kbd2: ukbd0, generic (0), config:0x0, >> flags:0x3d0000 >> Nov 4 17:54:03 odyssey kernel: uhid0: > rev 1.10/1.10, addr 3> on uhub2 [snip] > Configuration B > --------------- > * USB Legacy enabled in BIOS > * Kernel config -- includes atkbd, atkbdc, kbdmux, and USB > * loader.conf -- hint.atkbd.0.disable="1" > * loader.conf -- hint.atkbdc.0.disable="1" > * Able to type in MS-DOS, boot0, boot2/loader > * Able to type in multi-user mode > * No delays when switching virtual consoles in multi-user mode > * Able to type at mountroot prompt > > Draw your own conclusions. Jeremy, thank you very much for your detailed reply. I am very curious to see if in this configuration you get ukbd lines before or after 'Trying to mount root' line in dmesg. (In other words it would be very puzzling for me to see that keyboard works at mountroot prompt even though ukbd driver hasn't attached yet.) Thank you again!
-- Andriy Gapon From jrhett at netconsonance.com Wed Nov 5 15:30:08 2008 From: jrhett at netconsonance.com (Jo Rhett) Date: Wed Nov 5 15:30:18 2008 Subject: FreeBSD 6.4-RC2 available... In-Reply-To: <1225731426.42716.42.camel@bauer.cse.buffalo.edu> References: <1225731426.42716.42.camel@bauer.cse.buffalo.edu> Message-ID: <501B7A38-FCD7-4462-B9F2-BB893ED3559F@netconsonance.com> On Nov 3, 2008, at 8:57 AM, Ken Smith wrote: > The second Release Candidate for FreeBSD 6.4 is now available. > FreeBSD > 6.4-RC2 should be the last of the public test builds for the FreeBSD > 6.4 > release cycle. Unless a big show-stopper is found from this round of > testing we should begin the 6.4-RELEASE builds in about a week and a > half. We encourage you to test out 6.4-RC2 and report any problems by > submitting PRs or via email to the freebsd-stable list. I would love to, but it's still not available on the download area. Er, i386 isn't anyway. -- Jo Rhett Net Consonance : consonant endings by net philanthropy, open source and other randomness From jrhett at netconsonance.com Wed Nov 5 15:41:17 2008 From: jrhett at netconsonance.com (Jo Rhett) Date: Wed Nov 5 15:41:24 2008 Subject: 6.4 RC1 locks up solid on first reboot In-Reply-To: <200810271151.47366.jhb@freebsd.org> References: <84E1EC10-5323-4A8C-AD60-31142621DB32@netconsonance.com> <200810271151.47366.jhb@freebsd.org> Message-ID: On Oct 27, 2008, at 8:51 AM, John Baldwin wrote: > On Friday 24 October 2008 02:48:13 pm Jo Rhett wrote: >> So I booted up by CD and used Fixit mode to switch the system to boot >> via serial (keyboard detached), but this gathered me even less. 
>> >> /boot.config: -Dh >> Consoles: internal video/keyboard serial port >> BIOS drive A: is disk0 >> BIOS drive C: is disk1 >> BIOS drive D: is disk2 >> BIOS 639kB/4062144kB available memory >> >> FreeBSD/i386 bootstrap loader, Revision 1.1 >> (root@dessler.cse.b >> >> Plugging back in the monitor after lockup showed only a single char >> more: >> (root@dessler.cse.bu > > This confirms it is hanging in one of the two BIOS routines to > output a > character. One thing you can do would be to boot up and do the > following: > > dd if=/dev/mem bs=0x400 count=1 of=idt.out > dd if=/dev/mem bs=64k iseek=15 count=1 of=bios.out > > Then place those files some place I can fetch them. Both files are at http://support.netconsonance.com/freebsd/ FYI, this is notable -- the keyboard does not respond at the boot prompt. I mean the menu where you can escape to the loader prompt, with the fat freebsd ascii art. No keyboard presses are observed here. This is also true for the boot menu on the 6.4 installation CD too. No problems with 6.2 or 6.3 -- Jo Rhett Net Consonance : consonant endings by net philanthropy, open source and other randomness From jrhett at netconsonance.com Wed Nov 5 15:42:40 2008 From: jrhett at netconsonance.com (Jo Rhett) Date: Wed Nov 5 15:42:46 2008 Subject: FreeBSD 6.4-RC2 available... In-Reply-To: <501B7A38-FCD7-4462-B9F2-BB893ED3559F@netconsonance.com> References: <1225731426.42716.42.camel@bauer.cse.buffalo.edu> <501B7A38-FCD7-4462-B9F2-BB893ED3559F@netconsonance.com> Message-ID: <126557AC-CE1B-4FFC-8367-9931AEE187E2@netconsonance.com> On Nov 5, 2008, at 3:29 PM, Jo Rhett wrote: > On Nov 3, 2008, at 8:57 AM, Ken Smith wrote: >> The second Release Candidate for FreeBSD 6.4 is now available. >> FreeBSD >> 6.4-RC2 should be the last of the public test builds for the >> FreeBSD 6.4 >> release cycle. Unless a big show-stopper is found from this round of >> testing we should begin the 6.4-RELEASE builds in about a week and a >> half. 
We encourage you to test out 6.4-RC2 and report any problems >> by >> submitting PRs or via email to the freebsd-stable list. > > I would love to, but it's still not available on the download area. > Er, i386 isn't anyway. Ignore this. Apparently Camino caches the results of FTP directories. You have to explicitly hit reload to get the directory again, even if the results are days or weeks old. -- Jo Rhett Net Consonance : consonant endings by net philanthropy, open source and other randomness From marshc187 at gmail.com Wed Nov 5 16:10:49 2008 From: marshc187 at gmail.com (t-u-t) Date: Wed Nov 5 16:13:06 2008 Subject: usb mouse & problem reports (newbie) Message-ID: <332f78510811051543x4156df8coa342a21db069a1d9@mail.gmail.com> hello anyone, i have been playing with fbsd the past few months and have come across a minor problem and don't know if a problem report is in order, reported elsewhere than where i looked, or known issue, but i have noticed it in latest stable versions of both 7.1 and 6.4 (amd64 at least) and not sure if it is something to fix. the problem doesn't bother me much, just a bit curious about the issue and hope i can just point it out here at the moment. basically i have a copperhead razor mouse which used to be detected as a keyboard and has been fixed in stable versions, but if you drop to single user , then , it freezes and i have to unplug it and plug it in again to work. sorry and disregard if fixed in latest (i have a 6.4 rc2 fresh from site ISOs and haven't csup'ed as i am having a hard time building worlds on amd64 systems and looking into that slowly) -- ce la vie From amon at aelita.org Thu Nov 6 01:30:33 2008 From: amon at aelita.org (Herve Boulouis) Date: Thu Nov 6 01:30:42 2008 Subject: Multiple panics with 7.1-PRERELEASE amd64 and varnish Message-ID: <20081106102931.GD596@ra.aabs> Hi, Just put 3 boxes with varnish into production last night and I've got a few different panics in 8 hours.
Traffic is very low (each box generates 10 Mbit/s). All boxes have the same 7.1-PRERELEASE (from around end of september) and 4GB of ram and varnish is launched with the following command line : /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl \ -u varnish \ -n /opt/varnish \ -s file,/opt/varnish/storage.bin,4G \ -P /var/run/varnishd.pid \ -t 7464960000s \ -w 32,4096,120 \ -p listen_depth=4096 \ -p thread_pool_min=32 \ -p thread_pool_max=4096 \ -p lru_interval=3600 \ -T 0.0.0.0:3000 \ -h classic,500009 Sysctl tunings : www2:~# cat /etc/sysctl.conf net.inet.tcp.tso=0 debug.minidump=1 net.inet.tcp.rfc1323=1 kern.ipc.maxsockbuf=1024000 net.inet.tcp.sendspace=512000 net.inet.tcp.recvspace=512000 www2:~# cat /boot/loader.conf kern.ipc.nmbclusters=65536 kern.maxfiles=2097152 kern.maxfilesperproc=104856 kern.ipc.somaxconn=16384 net.inet.ip.portrange.last=65535 kern.threads.max_threads_per_proc=4096 kern.ipc.maxpipekva=104857600 What is strange is that those 3 boxes were hammered with siege and we had no problems with them generating 900Mbit/s of traffic. Are these panics known ? Can they be related to the sysctl ? Do I need to CC freebsd-amd64 ? Panic #1 (Box 2): #2 0xffffffff803c6b8d in boot (howto=260) at ../../../kern/kern_shutdown.c:418 #3 0xffffffff803c6e48 in panic (fmt=Variable "fmt" is not available. ) at ../../../kern/kern_shutdown.c:572 #4 0xffffffff806260b6 in trap_fatal (frame=0xffffffffcd816890, eva=138) at ../../../amd64/amd64/trap.c:764 #5 0xffffffff80626311 in trap_pfault (frame=0xffffffffcd816890, usermode=0) at ../../../amd64/amd64/trap.c:680 #6 0xffffffff80626c53 in trap (frame=0xffffffffcd816890) at ../../../amd64/amd64/trap.c:449 #7 0xffffffff8060d5be in calltrap () at ../../../amd64/amd64/exception.S:209 #8 0xffffffff80599dda in vm_object_clear_flag (object=0x0, bits=Variable "bits" is not available.
) at ../../../vm/vm_object.c:269 #9 0xffffffff804302a6 in cluster_wbuild (vp=0xffffff000eabfbd0, size=16384, start_lbn=58905, len=2) at ../../../kern/vfs_cluster.c:925 #10 0xffffffff804275de in vfs_bio_awrite (bp=0xffffffffa4d0b660) at ../../../kern/vfs_bio.c:1668 #11 0xffffffff8057800e in ffs_syncvnode (vp=0xffffff000eabfbd0, waitfor=Variable "waitfor" is not available. ) at ../../../ufs/ffs/ffs_vnops.c:283 #12 0xffffffff80573c91 in ffs_sync (mp=0xffffff000e5cc6f0, waitfor=3, td=0xffffff000e568a50) at ../../../ufs/ffs/ffs_vfsops.c:1234 #13 0xffffffff8043f821 in sync_fsync (ap=Variable "ap" is not available. ) at ../../../kern/vfs_subr.c:3217 #14 0xffffffff8065f552 in VOP_FSYNC_APV (vop=Variable "vop" is not available. ) at vnode_if.c:1007 #15 0xffffffff8043ff71 in sched_sync () at vnode_if.h:538 #16 0xffffffff803a6a01 in fork_exit (callout=0xffffffff8043f8f0 , arg=0x0, frame=0xffffffffcd816c80) at ../../../kern/kern_fork.c:804 #17 0xffffffff8060d98e in fork_trampoline () at ../../../amd64/amd64/exception.S:455 Panic #2 (Box 2): #2 0xffffffff803c6b8d in boot (howto=260) at ../../../kern/kern_shutdown.c:418 #3 0xffffffff803c6e48 in panic (fmt=Variable "fmt" is not available. ) at ../../../kern/kern_shutdown.c:572 #4 0xffffffff806260b6 in trap_fatal (frame=0xffffffffcd816890, eva=138) at ../../../amd64/amd64/trap.c:764 #5 0xffffffff80626311 in trap_pfault (frame=0xffffffffcd816890, usermode=0) at ../../../amd64/amd64/trap.c:680 #6 0xffffffff80626c53 in trap (frame=0xffffffffcd816890) at ../../../amd64/amd64/trap.c:449 #7 0xffffffff8060d5be in calltrap () at ../../../amd64/amd64/exception.S:209 #8 0xffffffff80599dda in vm_object_clear_flag (object=0x0, bits=Variable "bits" is not available. 
) at ../../../vm/vm_object.c:269 #9 0xffffffff804302a6 in cluster_wbuild (vp=0xffffff000ea4adc8, size=16384, start_lbn=20817, len=5) at ../../../kern/vfs_cluster.c:925 #10 0xffffffff804275de in vfs_bio_awrite (bp=0xffffffffa4d83660) at ../../../kern/vfs_bio.c:1668 #11 0xffffffff8057800e in ffs_syncvnode (vp=0xffffff000ea4adc8, waitfor=Variable "waitfor" is not available. ) at ../../../ufs/ffs/ffs_vnops.c:283 #12 0xffffffff80573c91 in ffs_sync (mp=0xffffff000e6c96f0, waitfor=3, td=0xffffff000e568a50) at ../../../ufs/ffs/ffs_vfsops.c:1234 #13 0xffffffff8043f821 in sync_fsync (ap=Variable "ap" is not available. ) at ../../../kern/vfs_subr.c:3217 #14 0xffffffff8065f552 in VOP_FSYNC_APV (vop=Variable "vop" is not available. ) at vnode_if.c:1007 #15 0xffffffff8043ff71 in sched_sync () at vnode_if.h:538 #16 0xffffffff803a6a01 in fork_exit (callout=0xffffffff8043f8f0 , arg=0x0, frame=0xffffffffcd816c80) at ../../../kern/kern_fork.c:804 #17 0xffffffff8060d98e in fork_trampoline () at ../../../amd64/amd64/exception.S:455 Panic #3 (Box 1): #3 0xffffffff803c6e48 in panic (fmt=Variable "fmt" is not available. ) at ../../../kern/kern_shutdown.c:572 #4 0xffffffff8059d76b in vm_page_unwire (m=0xffffff00d9e3ff70, activate=0) at ../../../vm/vm_page.c:1410 #5 0xffffffff80429cc8 in vfs_vmio_release (bp=0xffffffffa4ca5860) at ../../../kern/vfs_bio.c:1539 #6 0xffffffff8042b72b in getnewbuf (slpflag=0, slptimeo=0, size=Variable "size" is not available. ) at ../../../kern/vfs_bio.c:1847 #7 0xffffffff8042ccbe in getblk (vp=0xffffff001c6493f0, blkno=0, size=2048, slpflag=Variable "slpflag" is not available. ) at ../../../kern/vfs_bio.c:2602 #8 0xffffffff8042d65c in breadn (vp=0xffffff001c6493f0, blkno=Variable "blkno" is not available. ) at ../../../kern/vfs_bio.c:786 #9 0xffffffff8042d759 in bread (vp=Variable "vp" is not available. ) at ../../../kern/vfs_bio.c:734 #10 0xffffffff80578c92 in ffs_read (ap=Variable "ap" is not available. 
) at ../../../ufs/ffs/ffs_vnops.c:502 #11 0xffffffff8065f222 in VOP_READ_APV (vop=Variable "vop" is not available. ) at vnode_if.c:637 #12 0xffffffff805838b0 in ufs_readdir (ap=0xffffffffcfca4a80) at vnode_if.h:344 #13 0xffffffff8065f5d6 in VOP_READDIR_APV (vop=Variable "vop" is not available. ) at vnode_if.c:1407 #14 0xffffffff80448d6f in getdirentries (td=0xffffff000ef42000, uap=dwarf2_read_address: Corrupted DWARF expression. ) at vnode_if.h:747 #15 0xffffffff8062668d in syscall (frame=0xffffffffcfca4c80) at ../../../amd64/amd64/trap.c:907 #16 0xffffffff8060d7cb in Xfast_syscall () at ../../../amd64/amd64/exception.S:330 Panic #4 (Box 0): #1 0x0000000000000000 in ?? () #2 0xffffffff803c6b8d in boot (howto=260) at ../../../kern/kern_shutdown.c:418 #3 0xffffffff803c6e48 in panic (fmt=Variable "fmt" is not available. ) at ../../../kern/kern_shutdown.c:572 #4 0xffffffff806260b6 in trap_fatal (frame=0xffffffffcd8162a0, eva=138) at ../../../amd64/amd64/trap.c:764 #5 0xffffffff80626311 in trap_pfault (frame=0xffffffffcd8162a0, usermode=0) at ../../../amd64/amd64/trap.c:680 #6 0xffffffff80626c53 in trap (frame=0xffffffffcd8162a0) at ../../../amd64/amd64/trap.c:449 #7 0xffffffff8060d5be in calltrap () at ../../../amd64/amd64/exception.S:209 #8 0xffffffff80599dda in vm_object_clear_flag (object=0x0, bits=Variable "bits" is not available. 
) at ../../../vm/vm_object.c:269 #9 0xffffffff804302a6 in cluster_wbuild (vp=0xffffff000ea991f8, size=16384, start_lbn=57442, len=2) at ../../../kern/vfs_cluster.c:925 #10 0xffffffff804305ff in cluster_write (vp=0xffffff000ea991f8, bp=0xffffffffa5394360, filesize=4294967296, seqcount=127) at ../../../kern/vfs_cluster.c:570 #11 0xffffffff8057897a in ffs_write (ap=0xffffffffcd8166a0) at ../../../ufs/ffs/ffs_vnops.c:771 #12 0xffffffff80660834 in VOP_WRITE_APV (vop=0xffffffff80869280, a=0xffffffffcd8166a0) at vnode_if.c:691 #13 0xffffffff805a4ee5 in vnode_pager_generic_putpages (vp=0xffffff000ea991f8, m=0xffffffffcd816860, bytecount=Variable "bytecount" is not available. ) at vnode_if.h:373 #14 0xffffffff80430c31 in vop_stdputpages (ap=Variable "ap" is not available. ) at ../../../kern/vfs_default.c:550 #15 0xffffffff8065fc36 in VOP_PUTPAGES_APV (vop=Variable "vop" is not available. ) at vnode_if.c:2189 #16 0xffffffff805a5085 in vnode_pager_putpages (object=0xffffff000e6b6d00, m=0xffffffffcd816860, count=2, sync=8, rtvals=0xffffffffcd8167e0) at vnode_if.h:1164 #17 0xffffffff8059f79e in vm_pageout_flush (mc=0xffffffffcd816860, count=2, flags=8) at vm_pager.h:147 #18 0xffffffff8059baab in vm_object_page_collect_flush (object=0xffffff000e6b6d00, p=0xffffff00db42f178, curgeneration=Variable "curgeneration" is not available. ) at ../../../vm/vm_object.c:973 #19 0xffffffff8059bed6 in vm_object_page_clean (object=0xffffff000e6b6d00, start=0, end=Variable "end" is not available. ) at ../../../vm/vm_object.c:865 #20 0xffffffff8043f4ee in vfs_msync (mp=0xffffff000e69e378, flags=2) at ../../../kern/vfs_subr.c:2995 #21 0xffffffff8043f80a in sync_fsync (ap=Variable "ap" is not available. ) at ../../../kern/vfs_subr.c:3216 #22 0xffffffff8065f552 in VOP_FSYNC_APV (vop=Variable "vop" is not available. 
) at vnode_if.c:1007 #23 0xffffffff8043ff71 in sched_sync () at vnode_if.h:538 #24 0xffffffff803a6a01 in fork_exit (callout=0xffffffff8043f8f0 , arg=0x0, frame=0xffffffffcd816c80) at ../../../kern/kern_fork.c:804 #25 0xffffffff8060d98e in fork_trampoline () at ../../../amd64/amd64/exception.S:455 -- Herve Boulouis From amon at aelita.org Thu Nov 6 03:45:37 2008 From: amon at aelita.org (Herve Boulouis) Date: Thu Nov 6 03:45:44 2008 Subject: Multiple panics with 7.1-PRERELEASE amd64 and varnish In-Reply-To: <20081106102931.GD596@ra.aabs> References: <20081106102931.GD596@ra.aabs> Message-ID: <20081106124427.GE596@ra.aabs> On 06/11/2008 11:29, Herve Boulouis wrote: > > All boxes have the same 7.1-PRERELEASE (from around end of september) and 4GB of ram and varnish is launched with the following command line : I forgot to add that the kernel config is pretty much GENERIC (without KDTRACE_FRAME and KDTRACE_HOOKS) -- Herve Boulouis From avg at icyb.net.ua Thu Nov 6 04:34:51 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Thu Nov 6 04:38:02 2008 Subject: usb keyboard dying at loader prompt Message-ID: <4912E462.4090608@icyb.net.ua> I have quite a strange problem. This is with 7-BETA amd64. All of USB is out of the kernel and is loaded via modules. The BIOS has "Legacy USB" enabled. I have only a USB keyboard, no PS/2 port. The keyboard works fine in the BIOS and for selecting the boot device in the boot0 menu. It also works in the loader menu. If I select the loader prompt from the menu, it works for about 5 seconds and then "dies" - no reaction to key presses, no LED change, nothing. I haven't actually verified whether the keyboard would still work if I stayed in the loader menu for longer than ~10 seconds. This doesn't happen if USB is built into the kernel. Weird... 
-- Andriy Gapon From bms at incunabulum.net Thu Nov 6 05:38:14 2008 From: bms at incunabulum.net (Bruce M Simpson) Date: Thu Nov 6 05:38:21 2008 Subject: Firefox 3 fraks postscript fonts Message-ID: <4912F343.6060908@incunabulum.net> Hi, I've noticed that with Firefox 3, output on my venerable PostScript printer uses the wrong fonts. A garbled bitmapped font is being substituted. If I revert to Firefox 2, printed output is fine. Haven't seen this with other apps. Seen with a networked cups driven printer, specifically GCC Technologies (ancient LaserWriter II compatible device). The remote printer server is 7.1-PRERELEASE w/cup 1.3.9. Connection is lpt0 (centronics). FreeBSD empiric.lon.incunabulum.net 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Mon Oct 13 09:54:07 BST 2008 bms@empiric.lon.incunabulum.net:/usr/obj/usr/src/sys/EMPIRIC7 i386 empiric:~ % pkg_info | grep firefox firefox-2.0.0.14,1 Web browser based on the browser portion of Mozilla firefox-3.0.1_1,1 Web browser based on the browser portion of Mozilla empiric:~ % pkg_info | grep cups cups-base-1.3.7_2 Common UNIX Printing System cups-pdf-2.4.7 A virtual printer for CUPS to produce PDF files cups-pstoraster-8.15.4_1 Postscript interpreter for CUPS printing to non-PS printers gutenprint-cups-5.1.7_2 GutenPrint Printer Driver libgnomecups-0.2.3,1 Support library for gnome cups admistration empiric:~ % ldd /usr/local/lib/firefox/firefox-bin /usr/local/lib/firefox/firefox-bin: libmozjs.so => /usr/local/lib/firefox/libmozjs.so (0x2808f000) libxpcom.so => /usr/local/lib/firefox/libxpcom.so (0x28138000) libxpcom_core.so => /usr/local/lib/firefox/libxpcom_core.so (0x2813c000) libplds4.so.1 => /usr/local/lib/libplds4.so.1 (0x281dd000) libplc4.so.1 => /usr/local/lib/libplc4.so.1 (0x28208000) libnspr4.so.1 => /usr/local/lib/libnspr4.so.1 (0x28234000) libgtk-x11-2.0.so.0 => /usr/local/lib/libgtk-x11-2.0.so.0 (0x28265000) libgdk-x11-2.0.so.0 => /usr/local/lib/libgdk-x11-2.0.so.0 (0x285dd000) libatk-1.0.so.0 => 
/usr/local/lib/libatk-1.0.so.0 (0x28662000) libgdk_pixbuf-2.0.so.0 => /usr/local/lib/libgdk_pixbuf-2.0.so.0 (0x2867b000) libpangocairo-1.0.so.0 => /usr/local/lib/libpangocairo-1.0.so.0 (0x28693000) libXext.so.6 => /usr/local/lib/libXext.so.6 (0x2869d000) libXrender.so.1 => /usr/local/lib/libXrender.so.1 (0x286aa000) libXinerama.so.1 => /usr/local/lib/libXinerama.so.1 (0x286b3000) libXi.so.6 => /usr/local/lib/libXi.so.6 (0x286b6000) libXrandr.so.2 => /usr/local/lib/libXrandr.so.2 (0x286be000) libXcursor.so.1 => /usr/local/lib/libXcursor.so.1 (0x286c4000) libXcomposite.so.1 => /usr/local/lib/libXcomposite.so.1 (0x286cd000) libXdamage.so.1 => /usr/local/lib/libXdamage.so.1 (0x286d0000) libcairo.so.2 => /usr/local/lib/libcairo.so.2 (0x286d3000) libpangoft2-1.0.so.0 => /usr/local/lib/libpangoft2-1.0.so.0 (0x28738000) libpango-1.0.so.0 => /usr/local/lib/libpango-1.0.so.0 (0x28760000) libfreetype.so.9 => /usr/local/lib/libfreetype.so.9 (0x2879c000) libz.so.4 => /lib/libz.so.4 (0x2880a000) libfontconfig.so.1 => /usr/local/lib/libfontconfig.so.1 (0x2881c000) libX11.so.6 => /usr/local/lib/libX11.so.6 (0x28845000) libXfixes.so.3 => /usr/local/lib/libXfixes.so.3 (0x2892a000) libgobject-2.0.so.0 => /usr/local/lib/libgobject-2.0.so.0 (0x2892f000) libgmodule-2.0.so.0 => /usr/local/lib/libgmodule-2.0.so.0 (0x28969000) libglib-2.0.so.0 => /usr/local/lib/libglib-2.0.so.0 (0x2896d000) libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x28a1c000) libgthread-2.0.so.0 => /usr/local/lib/libgthread-2.0.so.0 (0x28b11000) libm.so.5 => /lib/libm.so.5 (0x28b16000) libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x28b2b000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x28c20000) libthr.so.3 => /lib/libthr.so.3 (0x28c2b000) libc.so.7 => /lib/libc.so.7 (0x28c3e000) libXau.so.6 => /usr/local/lib/libXau.so.6 (0x28d40000) libglitz.so.1 => /usr/local/lib/libglitz.so.1 (0x28d43000) libpng.so.5 => /usr/local/lib/libpng.so.5 (0x28d69000) libpixman-1.so.9 => /usr/local/lib/libpixman-1.so.9 (0x28d8e000) libexpat.so.6 
=> /usr/local/lib/libexpat.so.6 (0x28db6000) libXdmcp.so.6 => /usr/local/lib/libXdmcp.so.6 (0x28dd6000) librpcsvc.so.3 => /usr/lib/librpcsvc.so.3 (0x28ddb000) libintl.so.8 => /usr/local/lib/libintl.so.8 (0x28de3000) libicui18n.so.38 => /usr/local/lib/libicui18n.so.38 (0x28dec000) libpcre.so.0 => /usr/local/lib/libpcre.so.0 (0x28f43000) libm.so.4 => /usr/local/lib/compat/libm.so.4 (0x28f6a000) libicuuc.so.38 => /usr/local/lib/libicuuc.so.38 (0x28f80000) libicudata.so.38 => /usr/local/lib/libicudata.so.38 (0x2909e000) empiric:~ % ldd /usr/local/lib/firefox3/firefox-bin /usr/local/lib/firefox3/firefox-bin: libxul.so => not found (0x0) libmozjs.so => not found (0x0) libxpcom.so => not found (0x0) libplds4.so.1 => /usr/local/lib/libplds4.so.1 (0x2807e000) libplc4.so.1 => /usr/local/lib/libplc4.so.1 (0x280a9000) libnspr4.so.1 => /usr/local/lib/libnspr4.so.1 (0x280d5000) libgtk-x11-2.0.so.0 => /usr/local/lib/libgtk-x11-2.0.so.0 (0x28106000) libatk-1.0.so.0 => /usr/local/lib/libatk-1.0.so.0 (0x28475000) libgdk-x11-2.0.so.0 => /usr/local/lib/libgdk-x11-2.0.so.0 (0x2848e000) libgdk_pixbuf-2.0.so.0 => /usr/local/lib/libgdk_pixbuf-2.0.so.0 (0x28513000) libpangocairo-1.0.so.0 => /usr/local/lib/libpangocairo-1.0.so.0 (0x28534000) libXext.so.6 => /usr/local/lib/libXext.so.6 (0x2853e000) libXrender.so.1 => /usr/local/lib/libXrender.so.1 (0x2854b000) libXinerama.so.1 => /usr/local/lib/libXinerama.so.1 (0x28554000) libXi.so.6 => /usr/local/lib/libXi.so.6 (0x28557000) libXrandr.so.2 => /usr/local/lib/libXrandr.so.2 (0x2855f000) libXcursor.so.1 => /usr/local/lib/libXcursor.so.1 (0x28565000) libXcomposite.so.1 => /usr/local/lib/libXcomposite.so.1 (0x2856e000) libXdamage.so.1 => /usr/local/lib/libXdamage.so.1 (0x28571000) libcairo.so.2 => /usr/local/lib/libcairo.so.2 (0x28574000) libpangoft2-1.0.so.0 => /usr/local/lib/libpangoft2-1.0.so.0 (0x285d9000) libpango-1.0.so.0 => /usr/local/lib/libpango-1.0.so.0 (0x28601000) libfreetype.so.9 => /usr/local/lib/libfreetype.so.9 (0x2863d000) 
libz.so.4 => /lib/libz.so.4 (0x286ab000) libfontconfig.so.1 => /usr/local/lib/libfontconfig.so.1 (0x286bd000) libgmodule-2.0.so.0 => /usr/local/lib/libgmodule-2.0.so.0 (0x286e6000) libX11.so.6 => /usr/local/lib/libX11.so.6 (0x286ea000) libXfixes.so.3 => /usr/local/lib/libXfixes.so.3 (0x287cf000) libgobject-2.0.so.0 => /usr/local/lib/libgobject-2.0.so.0 (0x287d4000) libglib-2.0.so.0 => /usr/local/lib/libglib-2.0.so.0 (0x2880e000) libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x288bd000) libm.so.5 => /lib/libm.so.5 (0x289b2000) libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x289c7000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x28abc000) libthr.so.3 => /lib/libthr.so.3 (0x28ac7000) libc.so.7 => /lib/libc.so.7 (0x28ada000) libXau.so.6 => /usr/local/lib/libXau.so.6 (0x28bdc000) libglitz.so.1 => /usr/local/lib/libglitz.so.1 (0x28bdf000) libpng.so.5 => /usr/local/lib/libpng.so.5 (0x28c05000) libpixman-1.so.9 => /usr/local/lib/libpixman-1.so.9 (0x28c2a000) libexpat.so.6 => /usr/local/lib/libexpat.so.6 (0x28c52000) libintl.so.8 => /usr/local/lib/libintl.so.8 (0x28c72000) libXdmcp.so.6 => /usr/local/lib/libXdmcp.so.6 (0x28c7b000) librpcsvc.so.3 => /usr/lib/librpcsvc.so.3 (0x28c80000) libicui18n.so.38 => /usr/local/lib/libicui18n.so.38 (0x28c88000) libpcre.so.0 => /usr/local/lib/libpcre.so.0 (0x28ddf000) libm.so.4 => /usr/local/lib/compat/libm.so.4 (0x28e06000) libicuuc.so.38 => /usr/local/lib/libicuuc.so.38 (0x28e1c000) libicudata.so.38 => /usr/local/lib/libicudata.so.38 (0x28f3a000) cheers BMS From amon at aelita.org Thu Nov 6 06:06:58 2008 From: amon at aelita.org (Herve Boulouis) Date: Thu Nov 6 06:07:05 2008 Subject: Multiple panics with 7.1-PRERELEASE amd64 and varnish In-Reply-To: <20081106102931.GD596@ra.aabs> References: <20081106102931.GD596@ra.aabs> Message-ID: <20081106150548.GF596@ra.aabs> Le 06/11/2008 11:29, Herve Boulouis a écrit: I just tried to reboot one of the boxes without kern.ipc.maxpipekva=104857600 to check for kva problems but crashes persists, though the 
stack is completely different now. This time I included all the corrupt parts of the stack that I had stripped in my original email but they are similar (from frame 18 to end). Any ideas ? Unread portion of the kernel message buffer: vm_page_free: pindex(188034), busy(1), VPO_BUSY(0), hold(0) panic: vm_page_free: freeing busy page cpuid = 2 Uptime: 1h1m2s Physical memory: 4085 MB Dumping 289 MB: 274 258 242 226 210 194 178 162 146 130 114 98 82 66 50 34 18 2 #0 doadump () at pcpu.h:195 195 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump () at pcpu.h:195 #1 0x0000000000000000 in ?? () #2 0xffffffff803c6b8d in boot (howto=260) at ../../../kern/kern_shutdown.c:418 #3 0xffffffff803c6e48 in panic (fmt=Variable "fmt" is not available. ) at ../../../kern/kern_shutdown.c:572 #4 0xffffffff8059d816 in vm_page_free_toq (m=0x0) at ../../../vm/vm_page.c:1281 #5 0xffffffff8059d9c1 in vm_page_free (m=Variable "m" is not available. ) at ../../../vm/vm_page.c:498 #6 0xffffffff80411795 in socow_iodone (addr=Variable "addr" is not available. ) at ../../../kern/uipc_cow.c:92 #7 0xffffffff804129ca in mb_free_ext (m=0xffffff000ba64c00) at ../../../kern/uipc_mbuf.c:257 #8 0xffffffff80416781 in sbdrop_internal (sb=0xffffff000b4b2cc8, len=15469) at mbuf.h:515 #9 0xffffffff804168c0 in sbdrop_locked (sb=Variable "sb" is not available. ) at ../../../kern/uipc_sockbuf.c:898 #10 0xffffffff80418e05 in soisdisconnected (so=0xffffff000b4b2b40) at ../../../kern/uipc_socket.c:3158 #11 0xffffffff804e15b3 in tcp_close (tp=0xffffff000b6a8888) at ../../../netinet/tcp_subr.c:782 #12 0xffffffff804e16fa in tcp_drop (tp=0xffffff000b6a8888, errno=60) at ../../../netinet/tcp_subr.c:662 #13 0xffffffff804e65c2 in tcp_timer_rexmt (xtp=Variable "xtp" is not available. ) at ../../../netinet/tcp_timer.c:455 #14 0xffffffff803d81a3 in softclock (dummy=Variable "dummy" is not available. ) at ../../../kern/kern_timeout.c:274 #15 0xffffffff803a9a91 in ithread_loop (arg=Variable "arg" is not available. 
) at ../../../kern/kern_intr.c:1088 #16 0xffffffff803a6a01 in fork_exit (callout=0xffffffff803a98d8 , arg=0xffffff00010fcb80, frame=0xffffffffad853c80) at ../../../kern/kern_fork.c:804 #17 0xffffffff8060d98e in fork_trampoline () at ../../../amd64/amd64/exception.S:455 #18 0x0000000000000000 in ?? () #19 0x0000000000000000 in ?? () #20 0x0000000000000001 in ?? () #21 0x0000000000000000 in ?? () #22 0x0000000000000000 in ?? () #23 0x0000000000000000 in ?? () #24 0x0000000000000000 in ?? () #25 0x0000000000000000 in ?? () #26 0x0000000000000000 in ?? () #27 0x0000000000000000 in ?? () #28 0x0000000000000000 in ?? () #29 0x0000000000000000 in ?? () #30 0x0000000000000000 in ?? () #31 0x0000000000000000 in ?? () #32 0x0000000000000000 in ?? () #33 0x0000000000000000 in ?? () #34 0x0000000000000000 in ?? () #35 0x0000000000000000 in ?? () #36 0x0000000000000000 in ?? () #37 0x0000000000000000 in ?? () #49 0xffffff000110e370 in ?? () #50 0xffffffff803e5d68 in sched_switch (td=0xffffffff803a98d8, newtd=Variable "newtd" is not available. ) at ../../../kern/sched_ule.c:1938 #51 0x0000000000000000 in ?? () #52 0x0000000000000000 in ?? () #53 0x0000000000000000 in ?? () #54 0x0000000000000000 in ?? () #55 0x0000000000000000 in ?? () #56 0x0000000000000000 in ?? () #57 0x0000000000000000 in ?? () #58 0x0000000000000000 in ?? () #59 0x0000000000000000 in ?? () #60 0x0000000000000000 in ?? () #61 0x0000000000000000 in ?? () #62 0x0000000000000000 in ?? () #63 0x0000000000000000 in ?? () #64 0x0000000000000000 in ?? () #65 0x0000000000000000 in ?? () #66 0x0000000000000000 in ?? () #67 0x0000000000000000 in ?? () #68 0x0000000000000000 in ?? () #69 0x0000000000000000 in ?? () #70 0x0000000000000000 in ?? () ---Type to continue, or q to quit--- #71 0x0000000000000000 in ?? () #72 0x0000000000000000 in ?? () #73 0x0000000000000000 in ?? () #74 0x0000000000000000 in ?? () #75 0x0000000000000000 in ?? () #76 0x0000000000000000 in ?? () #77 0x0000000000000000 in ?? 
() #78 0x0000000000000000 in ?? () #79 0x0000000000000000 in ?? () #80 0x0000000000000000 in ?? () #81 0x0000000000000000 in ?? () #82 0x0000000000000000 in ?? () #83 0x0000000000000000 in ?? () #84 0x0000000000000000 in ?? () #85 0x0000000000000000 in ?? () #86 0x0000000000000000 in ?? () #87 0x0000000000000000 in ?? () #88 0x0000000000000000 in ?? () #89 0x0000000000000000 in ?? () #90 0x0000000000000000 in ?? () #91 0x0000000000000000 in ?? () #92 0x0000000000000000 in ?? () #93 0x0000000000000000 in ?? () #94 0x0000000000000000 in ?? () #95 0x0000000000000000 in ?? () #96 0x0000000000000000 in ?? () #97 0x0000000000000000 in ?? () #98 0x0000000000000000 in ?? () #99 0x0000000000000000 in ?? () #100 0x0000000000000000 in ?? () #101 0x0000000000000000 in ?? () #102 0x0000000000000000 in ?? () #103 0x0000000000000000 in ?? () #104 0x0000000000000000 in ?? () #105 0x0000000000000000 in ?? () #106 0x0000000000000000 in ?? () #107 0x0000000000000000 in ?? () #108 0x0000000000000000 in ?? () #109 0x0000000000000000 in ?? () #110 0x0000000000000000 in ?? () #111 0x0000000000000000 in ?? () #112 0x0000000000000000 in ?? () #113 0x0000000000000000 in ?? () #114 0x0000000000000000 in ?? () #115 0x0000000000000000 in ?? () #116 0x0000000000000000 in ?? () #117 0x0000000000000000 in ?? () #118 0x0000000000000000 in ?? () Cannot access memory at address 0xffffffffad854000 (kgdb) -- Herve Boulouis From gabriele.cecchetti at gmail.com Thu Nov 6 11:22:53 2008 From: gabriele.cecchetti at gmail.com (Gabriele Cecchetti) Date: Thu Nov 6 11:22:59 2008 Subject: Panics and freeze using age0 Message-ID: <49133B35.9040603@sssup.it> I'm running FreeBSD 7.1-PRERELEASE on an Asus P5Q motherboard. This board comes with a NIC that requires the age0 driver. When the NIC is under load, such as when transferring large amounts of data over a gigabit connection (say, multi-gigabyte backups), the computer hangs, sometimes with a panic, sometimes just freezing. 
Before that I was using an Intel NIC with the em0 driver and 7.0-Stable#2, and that configuration was really stable. Looking at the backtrace, we found that the page fault happened during pmap operations (sorry, I no longer have the backtrace to attach to this mail). Another problem with the age0 driver was that when the machine boots, the NIC goes up and down and then remains down. Every time, regaining control of it was a mess. Has anyone had a similar experience? Is the age0 driver well tested for production environments or not? Is it better to buy another Intel card and go on using the em0 driver? Are there any other problems affecting this release right now? And, last question: should 7.1-PRERELEASE belong to the STABLE tree? If so, why has it been so unstable for production environments during the last two months? Maybe I'm wrong, so I welcome suggestions. Cheers Gabriele From tinderbox at freebsd.org Thu Nov 6 16:47:37 2008 From: tinderbox at freebsd.org (FreeBSD Tinderbox) Date: Thu Nov 6 16:47:44 2008 Subject: [releng_7 tinderbox] failure on i386/i386 Message-ID: <20081107004733.D48CE1B5078@freebsd-stable.sentex.ca> TB --- 2008-11-06 23:29:29 - tinderbox 2.4 running on freebsd-stable.sentex.ca TB --- 2008-11-06 23:29:29 - starting RELENG_7 tinderbox run for i386/i386 TB --- 2008-11-06 23:29:29 - cleaning the object tree TB --- 2008-11-06 23:29:52 - cvsupping the source tree TB --- 2008-11-06 23:29:52 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/i386/i386/supfile TB --- 2008-11-06 23:30:01 - building world (CFLAGS=-O2 -pipe) TB --- 2008-11-06 23:30:01 - cd /src TB --- 2008-11-06 23:30:01 - /usr/bin/make -B buildworld >>> World build started on Thu Nov 6 23:30:02 UTC 2008 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: 
building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Fri Nov 7 00:36:54 UTC 2008 TB --- 2008-11-07 00:36:54 - generating LINT kernel config TB --- 2008-11-07 00:36:54 - cd /src/sys/i386/conf TB --- 2008-11-07 00:36:54 - /usr/bin/make -B LINT TB --- 2008-11-07 00:36:54 - building LINT kernel (COPTFLAGS=-O2 -pipe) TB --- 2008-11-07 00:36:54 - cd /src TB --- 2008-11-07 00:36:54 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri Nov 7 00:36:54 UTC 2008 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_ntptime.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_physio.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_pmc.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_poll.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_priv.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_proc.c /src/sys/kern/kern_proc.c: In function 'sysctl_kern_proc_vmmap': /src/sys/kern/kern_proc.c:1454: error: too few arguments to function 'VOP_GETATTR' *** Error code 1 Stop in /obj/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. 
TB --- 2008-11-07 00:47:33 - WARNING: /usr/bin/make returned exit code 1 TB --- 2008-11-07 00:47:33 - ERROR: failed to build lint kernel TB --- 2008-11-07 00:47:33 - tinderbox aborted TB --- 3811.96 user 399.60 system 4684.00 real http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-i386-i386.full From tinderbox at freebsd.org Thu Nov 6 18:03:45 2008 From: tinderbox at freebsd.org (FreeBSD Tinderbox) Date: Thu Nov 6 18:03:52 2008 Subject: [releng_7 tinderbox] failure on i386/pc98 Message-ID: <20081107020342.84FCE1B5078@freebsd-stable.sentex.ca> TB --- 2008-11-07 00:47:34 - tinderbox 2.4 running on freebsd-stable.sentex.ca TB --- 2008-11-07 00:47:34 - starting RELENG_7 tinderbox run for i386/pc98 TB --- 2008-11-07 00:47:34 - cleaning the object tree TB --- 2008-11-07 00:47:57 - cvsupping the source tree TB --- 2008-11-07 00:47:57 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/i386/pc98/supfile TB --- 2008-11-07 00:48:07 - building world (CFLAGS=-O2 -pipe) TB --- 2008-11-07 00:48:07 - cd /src TB --- 2008-11-07 00:48:07 - /usr/bin/make -B buildworld >>> World build started on Fri Nov 7 00:48:09 UTC 2008 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Fri Nov 7 01:55:01 UTC 2008 TB --- 2008-11-07 01:55:01 - generating LINT kernel config TB --- 2008-11-07 01:55:01 - cd /src/sys/pc98/conf TB --- 2008-11-07 01:55:01 - /usr/bin/make -B LINT TB --- 2008-11-07 01:55:01 - building LINT kernel (COPTFLAGS=-O2 -pipe) TB --- 2008-11-07 01:55:01 - cd /src TB --- 2008-11-07 01:55:01 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri Nov 7 01:55:01 UTC 2008 >>> 
stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_ntptime.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_physio.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_pmc.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_poll.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_priv.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_proc.c /src/sys/kern/kern_proc.c: In function 'sysctl_kern_proc_vmmap': /src/sys/kern/kern_proc.c:1454: error: too few arguments to function 'VOP_GETATTR' *** Error code 1 Stop in /obj/pc98/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2008-11-07 02:03:42 - WARNING: /usr/bin/make returned exit code 1 TB --- 2008-11-07 02:03:42 - ERROR: failed to build lint kernel TB --- 2008-11-07 02:03:42 - tinderbox aborted TB --- 3681.09 user 399.78 system 4568.36 real http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-i386-pc98.full From zbeeble at gmail.com Thu Nov 6 18:27:30 2008 From: zbeeble at gmail.com (Zaphod Beeblebrox) Date: Thu Nov 6 18:27:36 2008 Subject: zpool crash. Message-ID: <5f67a8c40811061827g46fe16adpb0c9d9c11db49dbe@mail.gmail.com> Someone posted something like this earlier... [3:6:306]root@canoe:/u/dgilbert> zpool attach canoe ad8s4d ad4s3d Assertion failed: (?Q), function rv == 0, file /canoe/64/usr/src/cddl/sbin/zpool/../../../cddl/contrib/opensolaris/cmd/zpool/zpool_vdev.c, line 131. Abort trap: 6 (core dumped) ... and the only reply I see is asking the poster to paste the given XML. Since I'm also getting this crash, I will post that here. 
From tinderbox at freebsd.org Thu Nov 6 18:39:11 2008 From: tinderbox at freebsd.org (FreeBSD Tinderbox) Date: Thu Nov 6 18:39:18 2008 Subject: [releng_7 tinderbox] failure on ia64/ia64 Message-ID: <20081107023908.B75DD1B5078@freebsd-stable.sentex.ca> TB --- 2008-11-07 00:57:15 - tinderbox 2.4 running on freebsd-stable.sentex.ca TB --- 2008-11-07 00:57:15 - starting RELENG_7 tinderbox run for ia64/ia64 TB --- 2008-11-07 00:57:16 - cleaning the object tree TB --- 2008-11-07 00:57:33 - cvsupping the source tree TB --- 2008-11-07 00:57:33 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/ia64/ia64/supfile TB --- 2008-11-07 00:57:40 - building world (CFLAGS=-O2 -pipe) TB --- 2008-11-07 00:57:40 - cd /src TB --- 2008-11-07 00:57:40 - /usr/bin/make -B buildworld >>> World build started on Fri Nov 7 00:57:41 UTC 2008 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Fri Nov 7 02:27:54 UTC 2008 TB --- 2008-11-07 02:27:54 - generating LINT kernel config TB --- 2008-11-07 02:27:54 - cd /src/sys/ia64/conf TB --- 2008-11-07 02:27:54 - /usr/bin/make -B LINT TB --- 2008-11-07 02:27:54 - building LINT kernel (COPTFLAGS=-O2 -pipe) TB --- 2008-11-07 02:27:54 - cd /src TB --- 2008-11-07 02:27:54 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri Nov 7 02:27:54 UTC 2008 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] 
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/kern/kern_mutex.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/kern/kern_ntptime.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/kern/kern_physio.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/kern/kern_pmc.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/kern/kern_priv.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/kern/kern_proc.c /src/sys/kern/kern_proc.c: In function 'sysctl_kern_proc_vmmap': /src/sys/kern/kern_proc.c:1454: error: too few arguments to function 'VOP_GETATTR' *** Error code 1 Stop in /obj/ia64/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. 
TB --- 2008-11-07 02:39:08 - WARNING: /usr/bin/make returned exit code 1 TB --- 2008-11-07 02:39:08 - ERROR: failed to build lint kernel TB --- 2008-11-07 02:39:08 - tinderbox aborted TB --- 5137.85 user 401.19 system 6112.66 real http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-ia64-ia64.full From tinderbox at freebsd.org Thu Nov 6 19:20:20 2008 From: tinderbox at freebsd.org (FreeBSD Tinderbox) Date: Thu Nov 6 19:20:34 2008 Subject: [releng_7 tinderbox] failure on powerpc/powerpc Message-ID: <20081107032017.C6DE41B5078@freebsd-stable.sentex.ca> TB --- 2008-11-07 02:03:42 - tinderbox 2.4 running on freebsd-stable.sentex.ca TB --- 2008-11-07 02:03:42 - starting RELENG_7 tinderbox run for powerpc/powerpc TB --- 2008-11-07 02:03:42 - cleaning the object tree TB --- 2008-11-07 02:03:59 - cvsupping the source tree TB --- 2008-11-07 02:03:59 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/powerpc/powerpc/supfile TB --- 2008-11-07 02:04:07 - building world (CFLAGS=-O2 -pipe) TB --- 2008-11-07 02:04:07 - cd /src TB --- 2008-11-07 02:04:07 - /usr/bin/make -B buildworld >>> World build started on Fri Nov 7 02:04:08 UTC 2008 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Fri Nov 7 03:12:24 UTC 2008 TB --- 2008-11-07 03:12:24 - generating LINT kernel config TB --- 2008-11-07 03:12:24 - cd /src/sys/powerpc/conf TB --- 2008-11-07 03:12:24 - /usr/bin/make -B LINT TB --- 2008-11-07 03:12:24 - building LINT kernel (COPTFLAGS=-O2 -pipe) TB --- 2008-11-07 03:12:24 - cd /src TB --- 2008-11-07 03:12:24 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri Nov 7 
03:12:25 UTC 2008 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -msoft-float -fno-omit-frame-pointer -msoft-float -ffreestanding -Werror /src/sys/kern/kern_mutex.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -msoft-float -fno-omit-frame-pointer -msoft-float -ffreestanding -Werror /src/sys/kern/kern_ntptime.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -msoft-float -fno-omit-frame-pointer -msoft-float -ffreestanding -Werror /src/sys/kern/kern_physio.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -msoft-float -fno-omit-frame-pointer -msoft-float -ffreestanding -Werror /src/sys/kern/kern_pmc.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -msoft-float -fno-omit-frame-pointer -msoft-float -ffreestanding -Werror /src/sys/kern/kern_priv.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -msoft-float -fno-omit-frame-pointer -msoft-float -ffreestanding -Werror /src/sys/kern/kern_proc.c /src/sys/kern/kern_proc.c: In function 'sysctl_kern_proc_vmmap': /src/sys/kern/kern_proc.c:1454: error: too few arguments to function 'VOP_GETATTR' *** Error code 1 Stop in /obj/powerpc/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2008-11-07 03:20:17 - WARNING: /usr/bin/make returned exit code 1 TB --- 2008-11-07 03:20:17 - ERROR: failed to build lint kernel TB --- 2008-11-07 03:20:17 - tinderbox aborted TB --- 3780.38 user 376.19 system 4595.02 real http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-powerpc-powerpc.full From tinderbox at freebsd.org Thu Nov 6 19:50:08 2008 From: tinderbox at freebsd.org (FreeBSD Tinderbox) Date: Thu Nov 6 19:50:27 2008 Subject: [releng_7 tinderbox] failure on sparc64/sparc64 Message-ID: <20081107035005.825A21B5078@freebsd-stable.sentex.ca> TB --- 2008-11-07 02:39:08 - tinderbox 2.4 running on freebsd-stable.sentex.ca TB --- 2008-11-07 02:39:08 - starting RELENG_7 tinderbox run for sparc64/sparc64 TB --- 2008-11-07 02:39:08 - cleaning the object tree TB --- 2008-11-07 02:39:22 - cvsupping the source tree TB --- 2008-11-07 02:39:22 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/sparc64/sparc64/supfile TB --- 2008-11-07 02:39:30 - building world (CFLAGS=-O2 -pipe) TB --- 2008-11-07 02:39:30 - cd /src TB --- 2008-11-07 02:39:30 - /usr/bin/make -B buildworld >>> World build started on Fri Nov 7 02:39:31 UTC 2008 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 
4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Fri Nov 7 03:42:34 UTC 2008 TB --- 2008-11-07 03:42:34 - generating LINT kernel config TB --- 2008-11-07 03:42:34 - cd /src/sys/sparc64/conf TB --- 2008-11-07 03:42:34 - /usr/bin/make -B LINT TB --- 2008-11-07 03:42:34 - building LINT kernel (COPTFLAGS=-O2 -pipe) TB --- 2008-11-07 03:42:34 - cd /src TB --- 2008-11-07 03:42:34 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri Nov 7 03:42:34 UTC 2008 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mcmodel=medany -msoft-float -ffreestanding -Werror /src/sys/kern/kern_mutex.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mcmodel=medany -msoft-float -ffreestanding -Werror /src/sys/kern/kern_ntptime.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mcmodel=medany -msoft-float -ffreestanding -Werror /src/sys/kern/kern_physio.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mcmodel=medany -msoft-float -ffreestanding -Werror /src/sys/kern/kern_pmc.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mcmodel=medany -msoft-float -ffreestanding -Werror /src/sys/kern/kern_priv.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mcmodel=medany -msoft-float -ffreestanding -Werror /src/sys/kern/kern_proc.c /src/sys/kern/kern_proc.c: In function 'sysctl_kern_proc_vmmap': /src/sys/kern/kern_proc.c:1454: error: too few arguments to function 'VOP_GETATTR' *** Error code 1 Stop in /obj/sparc64/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2008-11-07 03:50:05 - WARNING: /usr/bin/make returned exit code 1 TB --- 2008-11-07 03:50:05 - ERROR: failed to build lint kernel TB --- 2008-11-07 03:50:05 - tinderbox aborted TB --- 3572.13 user 364.69 system 4256.62 real http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-sparc64-sparc64.full From rorya+freebsd.org at TrueStep.com Thu Nov 6 21:16:19 2008 From: rorya+freebsd.org at TrueStep.com (Rory Arms) Date: Thu Nov 6 21:16:51 2008 Subject: 6.4-RC2 crashes after a few minutes of uptime Message-ID: <9592E887-75F3-473F-9581-F9C22A9936A6@TrueStep.com> Ken, First of all, I'm not subscribed to the freebsd-stable@ mailing list, so please include me in any replies. I have been using a Dell Pentium II Inspiron 3700 (192 MiB of RAM, 430 Mhz PII-Celeron CPU) to test FreeBSD 6.4 and have been noticing panics since upgrading it to 6.4-RC1. 
The system initially was installed with 6.3-RELEASE (single partition installation, ACPI enabled, GNOME 2) and had been quite stable. So, in an effort to help test 6.4, I upgraded it to 6.4-RC1 using freebsd-update(8). The upgrade has destabilized the system, as every single bootup since then has led to a panic. I had coredumps turned on, so I tried to gather a backtrace with kgdb(1), and every time I tried it would hang trying to parse the core dump. All I see is 4 lines of "Attempt to extract a component of a value that is not a structure pointer." and it hangs. So, today I decided to upgrade to 6.4-RC2 in hopes that it would resolve these problems, and it also crashes, though it does seem to last longer before panicking. With RC1 I couldn't even get a full login to GNOME to finish before the panic, most times. Now, with RC2 it seems to crash later, but this could just be coincidence. So far I've just had one crash with RC2, and am now running a few apps in GNOME to see if I can trigger it again. Note also, with RC2 I'm unable to analyze the kernel core dump either. kgdb(1) seems to hang with the same error as with RC1. I do have minidump on with RC2, and I've just now disabled it so that next time it crashes, it will save a full core file, to see if that makes a difference. Though, if RC1 is any indication, I doubt that will make any difference, but we'll see. All I can glean from the panic (from the info.0 file) is that the Panic String is a "page fault." I believe this was the same error with RC1 and as I recall every panic I was able to see (at the console) involved the process named "swi6." Guess I should try booting with ACPI disabled to see if there is any difference, though I never had to do this with 6.3. Well, if I can assist with further debugging, let me know.
Thanks, - rory From koitsu at FreeBSD.org Thu Nov 6 23:17:55 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Thu Nov 6 23:18:08 2008 Subject: Western Digital hard disks and ATA timeouts Message-ID: <20081107071752.GA5842@icarus.home.lan> A user and I on a broadband forum were discussing the possibility of diminishing quality of hard disks (particularly 1TB models) in recent days (specifically October). The user continually referenced something called "deep recovery cycle", backed with claims from Newegg reviewers (who often know very little or nothing at all -- grain of salt concept applies), which makes Western Digital's desktop hard disks unfit for RAID or server usage. I claimed shenanigans until the user pointed me to the following document on Western Digital's site: http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1397 The feature described apparently causes the hard disk to enter some form of aggressive sector scan/sector remapping loop, which can take up to 2 minutes to complete, during which time the hard disk is basically unusable. (I imagine ATA commands sent to the disk will simply time out or stall indefinitely, which would result in all sorts of timeout errors.) Note that Western Digital's "RAID edition" drives claim to take up to 7 seconds to reallocate sectors, using something they call TLER, which force-limits the amount of time the drive can spend reallocating. TLER cannot be disabled: http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1478 What baffles me is why Western Digital thinks that 2 minutes of the drive being unusable is acceptable "but only for desktops". Any FreeBSD desktop will start reporting ATA timeouts if the drive wedges for more than 5 seconds -- two minutes would just spew errors and hard-lock the system. What also baffles me is why Western Digital thinks the term "RAID" always means a hardware RAID controller is involved as a buffer between the OS and the disks.
Bzzzt, bad assumption on their part. So why do we care? As stated, FreeBSD's ATA command timeout is hard-set to 5 seconds, and is not adjustable without editing the ATA code yourself and increasing the value. The FreeNAS folks have made patches available to turn the timeout value into a sysctl. Soren and/or others, please increase this timeout value. Five seconds has now been deemed too aggressive a default. And please consider migrating the timeout value into a sysctl. P.S. -- I do not consider any of this reason to avoid Western Digital drives. But I would warn users to be a little more cautious before reporting ATA timeouts when newer (circa 2007 and later) WD drives are in use. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From pyunyh at gmail.com Thu Nov 6 23:24:07 2008 From: pyunyh at gmail.com (Pyun YongHyeon) Date: Thu Nov 6 23:24:14 2008 Subject: Panics and freeze using age0 In-Reply-To: <49133B35.9040603@sssup.it> References: <49133B35.9040603@sssup.it> Message-ID: <20081107072201.GD11486@cdnetworks.co.kr> On Thu, Nov 06, 2008 at 10:45:09AM -0800, Gabriele Cecchetti wrote: > I'm running FreeBSD 7.1-PRERELEASE on an Asus P5Q motherboard. > The board comes with a NIC requiring the age0 driver. > When the NIC is under load, such as transferring large amounts of data > over a gigabit connection (let's say some gigabyte backups), > the computer hangs, sometimes with a panic, sometimes freezing. > Without a backtrace it is really hard to guess the cause of the panic. So please send a backtrace to us. > Before that I was using an Intel NIC with the em0 driver > and 7.0-Stable#2.. and that configuration was really stable. > > Looking at the backtrace we found that the page fault happened > during pmap operations (sorry, I no longer have the backtrace to attach > to this mail).
> Another problem with the age0 driver was that when the machine boots, the NIC > goes up and down and then remains down. Every time, regaining control of it > was a mess. > Could you elaborate on this issue? Does it mean you have to manually play the up/down game with ifconfig(8) after boot? > Anyone with similar experience? > Is the age0 driver well tested for production environments or not? 7.1-RELEASE will be the first official release to ship age(4), so I don't think it has been well tested under various workloads. But if you can provide more information on the issue, I think it could be improved in the near future. > Is it better to buy another Intel card and go on using the em0 driver? Remember, all hardware supported by age(4) is for consumer motherboards. Even if L1 has much better performance/design than its successor, L1E, it's still not for the server market. The same is true for em(4). Since there are too many variants, you may have to choose the hardware model that is best suited to your workload. > Any other problems affecting this release just now? > > And, last question: should 7.1-PRERELEASE belong to the STABLE tree? > In that case, why has it been so unstable for production environments > during the last two months? > > Maybe I'm wrong, so I accept suggestions. -- Regards, Pyun YongHyeon From gabriele.cecchetti at gmail.com Thu Nov 6 23:34:52 2008 From: gabriele.cecchetti at gmail.com (Gabriele Cecchetti) Date: Thu Nov 6 23:34:59 2008 Subject: Panics and freeze using age0 In-Reply-To: <20081107072201.GD11486@cdnetworks.co.kr> References: <49133B35.9040603@sssup.it> <20081107072201.GD11486@cdnetworks.co.kr> Message-ID: <4913EF93.6010006@sssup.it> Pyun YongHyeon wrote: [...] > Could you elaborate on this issue? Does it mean you have > to manually play the up/down game with ifconfig(8) after boot? Yes, I need to do that. Sometimes it happens that even though the interface is shown in the boot log, it is not shown by ifconfig.
Typing ifconfig again shows the interface. Then I have to down/up the interface and/or ping outside, as if the interface were sleeping. > > Anyone with similar experience? > > Is the age0 driver well tested for production environments or not? > [...] > > Is it better to buy another Intel card and go on using the em0 driver? > > Remember, all hardware supported by age(4) is for consumer > motherboards. Even if L1 has much better performance/design than > its successor, L1E, it's still not for the server market. > The same is true for em(4). Since there are too many variants, you > may have to choose the hardware model that is best suited to your > workload. The other card I'm using is: em0@pci0:5:1:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = 'PRO/1000 GT' class = network subclass = ethernet So, which card is the best buy for the server market? Cheers Gabriele From fbsdlist at src.cx Fri Nov 7 00:08:04 2008 From: fbsdlist at src.cx (Artem Belevich) Date: Fri Nov 7 00:08:11 2008 Subject: Western Digital hard disks and ATA timeouts In-Reply-To: <20081107071752.GA5842@icarus.home.lan> References: <20081107071752.GA5842@icarus.home.lan> Message-ID: > Note that Western Digital's "RAID edition" drives claim to take up to 7 > seconds to reallocate sectors, using something they call TLER, which > force-limits the amount of time the drive can spend reallocating. TLER > cannot be disabled: TLER can be enabled/disabled on recent WD drives (SE16/RE2/GP). SE16/GP come with TLER off, RE2 with TLER on. Google the WDTLER utility. It can apparently be obtained from WD by asking them nicely. Or, yet again, Google is your friend.
Here's one example - http://www.hardforum.com/archive/index.php/t-1191548.html --Artem From barbara.xxx1975 at libero.it Fri Nov 7 00:10:26 2008 From: barbara.xxx1975 at libero.it (barbara.xxx1975@libero.it) Date: Fri Nov 7 00:11:23 2008 Subject: R: 6.4-RC2 crashes after a few minutes of uptime Message-ID: <31468697.1064671226045407700.JavaMail.defaultUser@defaultHost> >Ken, > >First of all, I'm not subscribed to the freebsd-stable@ mailing list, >so please include me in any replies. I have been using a Dell Pentium >II Inspiron 3700 (192 MiB of RAM, 430 MHz PII-Celeron CPU) to test >FreeBSD 6.4 and have been noticing panics since upgrading it to 6.4-RC1. > >The system initially was installed with 6.3-RELEASE (single partition >installation, ACPI enabled, GNOME 2) and had been quite stable. So, in >an effort to help test 6.4, I upgraded it to 6.4-RC1 using >freebsd-update(8). The upgrade has destabilized the system, as every single >bootup since then has led to a panic. I had coredumps turned on, so I >tried to gather a backtrace with kgdb(1) and every time I'd try it >would hang trying to parse the core dump. All I see is 4 lines of >"Attempt to extract a component of a value that is not a structure >pointer." and it hangs. > >So, today I decided to upgrade to 6.4-RC2 in hopes that it would >resolve these problems and it also crashes. Though it does seem to >last longer before panicking. With RC1 I couldn't even get a full login >to GNOME to finish before the panic, most times. Now, with RC2 it >seems to crash later, but this could just be coincidence. So far I've >just had one crash with RC2, and am now running a few apps in GNOME to >see if I can trigger it again. > >Note also, with RC2 I'm unable to analyze the kernel core dump either. >kgdb(1) seems to hang with the same error as with RC1. I do have >minidump on with RC2, and I've just now disabled it so that next time >it crashes, it will save a full core file, to see if that makes a >difference.
Though, if RC1 is any indication, I doubt that will make >any difference, but we'll see. All I can glean from the panic (from >the info.0 file) is that the Panic String is a "page fault." I believe >this was the same error with RC1 and as I recall every panic I was >able to see (at the console) it involved the process named "swi6." >Guess I should try booting with ACPI disabled to see if there is any >difference, though I never had to do this with 6.3. > >Well, if I can assist with further debugging, let me know. > >Thanks, > >- rory Hello, I had a similar problem, described in this thread: http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html To summarize, I had several panics involving swi6 (but I had them after several hours of uptime). I'm tracking STABLE, generally resyncing /usr/src every 1-2 weeks, and I started having problems around Oct. 5. Unfortunately I got no answer. The problem *seems* to have gone away after taking the following actions: - disabling powerd (I had enabled it a few days before the problem emerged) - getting new sources and doing a new buildworld - rebuilding some GNOME ports (the "non usual" ones I was using when the panics occurred) and all the GNOME daemons (sysutils/hal etc.) - as the guy having a similar problem on 7 "solved" moving away from gnome I'm not sure which action was the decisive one (and even though I have had no more panics, I'm not really sure that the problem can be considered solved!) I hope this helps with your stability problem...
regards Barbara From koitsu at FreeBSD.org Fri Nov 7 00:36:30 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Fri Nov 7 00:36:39 2008 Subject: Western Digital hard disks and ATA timeouts In-Reply-To: References: <20081107071752.GA5842@icarus.home.lan> Message-ID: <20081107083626.GA1583@icarus.home.lan> On Fri, Nov 07, 2008 at 12:08:01AM -0800, Artem Belevich wrote: > > Note that Western Digital's "RAID edition" drives claim to take up to 7 > > seconds to reallocate sectors, using something they call TLER, which > > force-limits the amount of time the drive can spend reallocating. TLER > > cannot be disabled: > > TLER can be enabled/disabled on recent WD drives (SE16/RE2/GP). SE16/GP > come with TLER off, RE2 with TLER on. Google WDTLER utility. > It can apparently be obtained from WD by asking them nicely. > Or, yet again, google is your friend. Here's one example - > http://www.hardforum.com/archive/index.php/t-1191548.html Thanks for the information. Nice to know one of their FAQ entries is false. Also, note that "SE16/RE2/GP" is not specific enough; I have SE16 drives from 2005, and I highly doubt those have TLER capability due to their age. Also, there's a Wikipedia article on this whole fiasco. http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery It also appears Samsung drives have a similar feature called CCTL, which uses a value of 7 or 8 seconds: http://www.samsung.com/global/business/hdd/learningresource/whitepapers/LearningResource_CCTL.html But regardless of TLER being toggleable, FreeBSD's ATA command timeout of 5 seconds is too aggressive, and should be increased. Likewise, the value should be a sysctl, so those who do want such aggressive values can use it at the community's -- or their own -- behest. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From pyunyh at gmail.com Fri Nov 7 01:03:30 2008 From: pyunyh at gmail.com (Pyun YongHyeon) Date: Fri Nov 7 01:03:36 2008 Subject: Panics and freeze using age0 In-Reply-To: <4913EF93.6010006@sssup.it> References: <49133B35.9040603@sssup.it> <20081107072201.GD11486@cdnetworks.co.kr> <4913EF93.6010006@sssup.it> Message-ID: <20081107090125.GG11486@cdnetworks.co.kr> On Thu, Nov 06, 2008 at 11:34:43PM -0800, Gabriele Cecchetti wrote: > Pyun YongHyeon wrote: > [...] > > >Could you elaborate on this issue? Does it mean you have > >to manually play the up/down game with ifconfig(8) after boot? > > Yes, I need to do that. > Sometimes it happens that even though the interface > is shown in the boot log, it is not shown by ifconfig. > Typing ifconfig again shows the interface. > Then I have to down/up the interface and/or ping outside, as if the > interface were sleeping. > Could you show me the dmesg output? > > > Anyone with similar experience? > > > Is the age0 driver well tested for production environments or not? > > > [...] > > > Is it better to buy another Intel card and go on using the em0 driver? > > > >Remember, all hardware supported by age(4) is for consumer > >motherboards. Even if L1 has much better performance/design than > >its successor, L1E, it's still not for the server market. > >The same is true for em(4). Since there are too many variants, you > >may have to choose the hardware model that is best suited to your > >workload. > The other card I'm using is: > em0@pci0:5:1:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 > hdr=0x00 > vendor = 'Intel Corporation' > device = 'PRO/1000 GT' > class = network > subclass = ethernet > > So, which card is the best buy for the server market? Maybe other users on this list can answer that.
-- Regards, Pyun YongHyeon From koitsu at FreeBSD.org Fri Nov 7 01:17:01 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Fri Nov 7 01:17:09 2008 Subject: Panics and freeze using age0 In-Reply-To: <4913EF93.6010006@sssup.it> References: <49133B35.9040603@sssup.it> <20081107072201.GD11486@cdnetworks.co.kr> <4913EF93.6010006@sssup.it> Message-ID: <20081107091659.GA1552@icarus.home.lan> On Thu, Nov 06, 2008 at 11:34:43PM -0800, Gabriele Cecchetti wrote: > The other card I'm using is: > em0@pci0:5:1:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 > hdr=0x00 > vendor = 'Intel Corporation' > device = 'PRO/1000 GT' > class = network > subclass = ethernet > > So, which card is the best buy for the server market ? That's an easy one: the Intel card. The FreeBSD em(4) and igb(4) drivers are both maintained by Jack Vogel of Intel (who is equally as friendly and reliable as the drivers :-) ). -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From tinderbox at freebsd.org Fri Nov 7 03:08:29 2008 From: tinderbox at freebsd.org (FreeBSD Tinderbox) Date: Fri Nov 7 03:08:37 2008 Subject: [releng_7 tinderbox] failure on amd64/amd64 Message-ID: <20081107110826.804271B5078@freebsd-stable.sentex.ca> TB --- 2008-11-07 09:22:50 - tinderbox 2.4 running on freebsd-stable.sentex.ca TB --- 2008-11-07 09:22:50 - starting RELENG_7 tinderbox run for amd64/amd64 TB --- 2008-11-07 09:22:50 - cleaning the object tree TB --- 2008-11-07 09:23:18 - cvsupping the source tree TB --- 2008-11-07 09:23:18 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/amd64/amd64/supfile TB --- 2008-11-07 09:23:25 - building world (CFLAGS=-O2 -pipe) TB --- 2008-11-07 09:23:25 - cd /src TB --- 2008-11-07 09:23:25 - /usr/bin/make -B buildworld >>> World build started on Fri Nov 7 09:23:26 UTC 2008 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> stage 5.1: building 32 bit shim libraries >>> World build completed on Fri Nov 7 10:58:49 UTC 2008 TB --- 2008-11-07 10:58:49 - generating LINT kernel config TB --- 2008-11-07 10:58:49 - cd /src/sys/amd64/conf TB --- 2008-11-07 10:58:49 - /usr/bin/make -B LINT TB --- 2008-11-07 10:58:49 - building LINT kernel (COPTFLAGS=-O2 -pipe) TB --- 2008-11-07 10:58:49 - cd /src TB --- 2008-11-07 10:58:49 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri Nov 7 10:58:49 UTC 2008 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] 
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_ntptime.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_physio.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_pmc.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_poll.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_priv.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_proc.c
/src/sys/kern/kern_proc.c: In function 'sysctl_kern_proc_vmmap':
/src/sys/kern/kern_proc.c:1454: error: too few arguments to function 'VOP_GETATTR'
*** Error code 1

Stop in /obj/amd64/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2008-11-07 11:08:26 - WARNING: /usr/bin/make returned exit code 1
TB --- 2008-11-07 11:08:26 - ERROR: failed to build lint kernel
TB --- 2008-11-07 11:08:26 - tinderbox aborted
TB --- 5073.91 user 567.19 system 6336.27 real

http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-amd64-amd64.full

From tinderbox at freebsd.org Fri Nov 7 03:31:31 2008
From: tinderbox at freebsd.org (FreeBSD Tinderbox)
Date: Fri Nov 7 03:31:49 2008
Subject: [releng_7 tinderbox] failure on i386/i386
Message-ID: <20081107113128.9185A1B5078@freebsd-stable.sentex.ca>

TB --- 2008-11-07 10:13:12 - tinderbox 2.4 running on freebsd-stable.sentex.ca
TB --- 2008-11-07 10:13:12 - starting RELENG_7 tinderbox run for i386/i386
TB --- 2008-11-07 10:13:12 - cleaning the object tree
TB --- 2008-11-07 10:13:32 - cvsupping the source tree
TB --- 2008-11-07 10:13:32 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/i386/i386/supfile
TB --- 2008-11-07 10:13:39 - building world (CFLAGS=-O2 -pipe)
TB --- 2008-11-07 10:13:39 - cd /src
TB --- 2008-11-07 10:13:39 - /usr/bin/make -B buildworld
>>> World build started on Fri Nov 7 10:13:40 UTC 2008
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
>>> World build completed on Fri Nov 7 11:21:04 UTC 2008
TB --- 2008-11-07 11:21:04 - generating LINT kernel config
TB --- 2008-11-07 11:21:04 - cd /src/sys/i386/conf
TB --- 2008-11-07 11:21:04 - /usr/bin/make -B LINT
TB --- 2008-11-07 11:21:04 - building LINT kernel (COPTFLAGS=-O2 -pipe)
TB --- 2008-11-07 11:21:04 - cd /src
TB --- 2008-11-07 11:21:04 - /usr/bin/make -B buildkernel KERNCONF=LINT
>>> Kernel build for LINT started on Fri Nov 7 11:21:04 UTC 2008
>>> stage 1: configuring the kernel
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3.1: making dependencies
>>> stage 3.2: building everything
[...]
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_ntptime.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_physio.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_pmc.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_poll.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_priv.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Werror -pg -mprofiler-epilogue /src/sys/kern/kern_proc.c
/src/sys/kern/kern_proc.c: In function 'sysctl_kern_proc_vmmap':
/src/sys/kern/kern_proc.c:1454: error: too few arguments to function 'VOP_GETATTR'
*** Error code 1

Stop in /obj/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2008-11-07 11:31:28 - WARNING: /usr/bin/make returned exit code 1
TB --- 2008-11-07 11:31:28 - ERROR: failed to build lint kernel
TB --- 2008-11-07 11:31:28 - tinderbox aborted
TB --- 3807.83 user 402.11 system 4696.38 real

http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-i386-i386.full

From avg at icyb.net.ua Fri Nov 7 03:37:11 2008
From: avg at icyb.net.ua (Andriy Gapon)
Date: Fri Nov 7 03:37:18 2008
Subject: firewire disk disconnected but da* remains?
Message-ID: <49142861.6020304@icyb.net.ua>

I am a firewire newbie, so forgive me the following question.

I disconnect external firewire HDD, firewire subsystem notices this but
da0 device entry persists. Is this correct/expected behavior?

on connect:
kernel: fwohci0: BUS reset
kernel: fwohci0: node_id=0xc800ffc1, gen=2, CYCLEMASTER mode
kernel: firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me)
kernel: firewire0: bus manager 1 (me)
kernel: fwohci0: txd err=14 ack busy_X
last message repeated 2 times
kernel: fwohci0: BUS reset
kernel: fwohci0: node_id=0xc800ffc1, gen=3, CYCLEMASTER mode
kernel: firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me)
kernel: firewire0: bus manager 1 (me)
kernel: firewire0: New S400 device ID:0050770e00071002
kernel: da0 at sbp0 bus 0 target 0 lun 0
kernel: da0: Fixed Simplified Direct Access SCSI-4 device
kernel: da0: 50.000MB/s transfers
kernel: da0: 381554MB (781422768 512 byte sectors: 255H 63S/T 48641C)
kernel: GEOM_LABEL: Label for provider da0s1 is ufs/extbackup.

on disconnect:
kernel: fwohci0: BUS reset
kernel: fwohci0: node_id=0xc800ffc0, gen=4, CYCLEMASTER mode
kernel: firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
kernel: firewire0: bus manager 0 (me)

camcontrol rescan all done some time later stuck in cbwait state.

System is recent releng/7 amd64.

--
Andriy Gapon

From ken73.chen at gmail.com Fri Nov 7 03:52:13 2008
From: ken73.chen at gmail.com (Ken Chen)
Date: Fri Nov 7 03:52:21 2008
Subject: php-cgi frozen with sbwait when SMP enable
Message-ID:

Hello,

I have 4 web servers with lighttpd to serve one web site with DNS load
sharing. On the 2 SMP-enable web servers, there will be many php-cgi frozen
in 'sbwait' state every day. It means the php-cgi stay in 'sbwait' state,
and never be back to 'accept' or other state. If I restart them, there will
be frozen php-cgi appear some hours later.

There is no problem on the other single CPU web servers which running same
php scripts and same configuration and version of PHP.

Why and any solution? Thanks!

Regards,
Ken

From ivoras at freebsd.org Fri Nov 7 04:13:04 2008
From: ivoras at freebsd.org (Ivan Voras)
Date: Fri Nov 7 04:13:12 2008
Subject: php-cgi frozen with sbwait when SMP enable
In-Reply-To:
References:
Message-ID:

Ken Chen wrote:
> Hello,
>
> I have 4 web servers with lighttpd to serve one web site with DNS load
> sharing. On the 2 SMP-enable web servers, there will be many php-cgi frozen
> in 'sbwait' state every day. It means the php-cgi stay in 'sbwait' state,
> and never be back to 'accept' or other state. If I restart them, there will
> be frozen php-cgi appear some hours later.

You didn't give any information about your environment, specifically
versions of FreeBSD, PHP and lighttpd you use, and in what way you use
PHP (I'm guessing you're using FastCGI).

> There is no problem on the other single CPU web servers which running same
> php scripts and same configuration and version of PHP.
>
> Why and any solution?

AFAIK sbwait is socket buffer wait, meaning the process is waiting for
some data over a socket (and, in your case, it's not getting it).

I'm using php-cgi in FastCGI mode (with mod_fcgid on Apache) on about a
dozen servers, all SMP, without problems. Some Apaches are
worker-threaded and some use the event MPM.

It very much looks like your problem could be a bug in lighttpd.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 252 bytes
Desc: OpenPGP digital signature
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081107/13e71f0f/signature.pgp

From tinderbox at freebsd.org Fri Nov 7 04:24:57 2008
From: tinderbox at freebsd.org (FreeBSD Tinderbox)
Date: Fri Nov 7 04:25:09 2008
Subject: [releng_7 tinderbox] failure on i386/pc98
Message-ID: <20081107122453.298AD1B5078@freebsd-stable.sentex.ca>

TB --- 2008-11-07 11:08:26 - tinderbox 2.4 running on freebsd-stable.sentex.ca
TB --- 2008-11-07 11:08:26 - starting RELENG_7 tinderbox run for i386/pc98
TB --- 2008-11-07 11:08:26 - cleaning the object tree
TB --- 2008-11-07 11:08:41 - cvsupping the source tree
TB --- 2008-11-07 11:08:41 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/i386/pc98/supfile
TB --- 2008-11-07 11:08:49 - building world (CFLAGS=-O2 -pipe)
TB --- 2008-11-07 11:08:49 - cd /src
TB --- 2008-11-07 11:08:49 - /usr/bin/make -B buildworld
>>> World build started on Fri Nov 7 11:08:50 UTC 2008
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
>>> World build completed on Fri Nov 7 12:15:49 UTC 2008
TB --- 2008-11-07 12:15:49 - generating LINT kernel config
TB --- 2008-11-07 12:15:49 - cd /src/sys/pc98/conf
TB --- 2008-11-07 12:15:49 - /usr/bin/make -B LINT
TB --- 2008-11-07 12:15:49 - building LINT kernel (COPTFLAGS=-O2 -pipe)
TB --- 2008-11-07 12:15:49 - cd /src
TB --- 2008-11-07 12:15:49 - /usr/bin/make -B buildkernel KERNCONF=LINT
>>> Kernel build for LINT started on Fri Nov 7 12:15:50 UTC 2008
>>> stage 1: configuring the kernel
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3.1: making dependencies
>>> stage 3.2: building everything
[...]
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_ntptime.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_physio.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_pmc.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_poll.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_priv.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -pg -mprofiler-epilogue /src/sys/kern/kern_proc.c
/src/sys/kern/kern_proc.c: In function 'sysctl_kern_proc_vmmap':
/src/sys/kern/kern_proc.c:1454: error: too few arguments to function 'VOP_GETATTR'
*** Error code 1

Stop in /obj/pc98/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2008-11-07 12:24:52 - WARNING: /usr/bin/make returned exit code 1
TB --- 2008-11-07 12:24:52 - ERROR: failed to build lint kernel
TB --- 2008-11-07 12:24:52 - tinderbox aborted
TB --- 3686.23 user 398.26 system 4586.38 real

http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-i386-pc98.full

From tinderbox at freebsd.org Fri Nov 7 05:15:18 2008
From: tinderbox at freebsd.org (FreeBSD Tinderbox)
Date: Fri Nov 7 05:15:35 2008
Subject: [releng_7 tinderbox] failure on ia64/ia64
Message-ID: <20081107131515.10BA31B5078@freebsd-stable.sentex.ca>

TB --- 2008-11-07 11:31:28 - tinderbox 2.4 running on freebsd-stable.sentex.ca
TB --- 2008-11-07 11:31:28 - starting RELENG_7 tinderbox run for ia64/ia64
TB --- 2008-11-07 11:31:28 - cleaning the object tree
TB --- 2008-11-07 11:31:42 - cvsupping the source tree
TB --- 2008-11-07 11:31:42 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/RELENG_7/ia64/ia64/supfile
TB --- 2008-11-07 11:31:49 - building world (CFLAGS=-O2 -pipe)
TB --- 2008-11-07 11:31:49 - cd /src
TB --- 2008-11-07 11:31:49 - /usr/bin/make -B buildworld
>>> World build started on Fri Nov 7 11:31:50 UTC 2008
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
>>> World build completed on Fri Nov 7 13:01:36 UTC 2008
TB --- 2008-11-07 13:01:36 - generating LINT kernel config
TB --- 2008-11-07 13:01:36 - cd /src/sys/ia64/conf
TB --- 2008-11-07 13:01:36 - /usr/bin/make -B LINT
TB --- 2008-11-07 13:01:36 - building LINT kernel (COPTFLAGS=-O2 -pipe)
TB --- 2008-11-07 13:01:36 - cd /src
TB --- 2008-11-07 13:01:36 - /usr/bin/make -B buildkernel KERNCONF=LINT
>>> Kernel build for LINT started on Fri Nov 7 13:01:36 UTC 2008
>>> stage 1: configuring the kernel
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3.1: making dependencies
>>> stage 3.2: building everything
[...]
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/netgraph/ng_tty.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/netgraph/ng_vjc.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/netinet/accf_data.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/netinet/accf_http.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/netinet/if_atm.c
cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/netinet/if_ether.c
/src/sys/netinet/if_ether.c: In function 'arpresolve':
/src/sys/netinet/if_ether.c:399: error: dereferencing pointer to incomplete type
*** Error code 1

Stop in /obj/ia64/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2008-11-07 13:15:14 - WARNING: /usr/bin/make returned exit code 1
TB --- 2008-11-07 13:15:14 - ERROR: failed to build lint kernel
TB --- 2008-11-07 13:15:14 - tinderbox aborted
TB --- 5262.35 user 404.68 system 6226.05 real

http://tinderbox.des.no/tinderbox-releng_7-RELENG_7-ia64-ia64.full

From ken73.chen at gmail.com Fri Nov 7 06:10:52 2008
From: ken73.chen at gmail.com (Ken Chen)
Date: Fri Nov 7 06:10:58 2008
Subject: php-cgi frozen with sbwait when SMP enable
In-Reply-To:
References:
Message-ID:

Oh.. sorry, I forgot to provide the information of my environment.

web4# php-cgi -v
PHP 5.2.6 (cgi-fcgi) (built: Nov 2 2008 11:16:30)
Copyright (c) 1997-2008 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies
    with XCache v1.2.2, Copyright (c) 2005-2007, by mOo
web4# /usr/local/lighttpd/sbin/lighttpd -v
lighttpd-1.4.19 - a light and fast webserver
Build-Date: Sep 1 2008 16:58:51
web4# uname -a
FreeBSD web4.xxxx.com 7.0-RELEASE-p5 FreeBSD 7.0-RELEASE-p5 #11: Mon Nov 3 01:10:36 CST 2008 root@web4.xxxx.com:/usr/obj/usr/src/sys/WEB4 i386
web4# ps alx | grep php-cgi | grep -v grep | grep sbwait
65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 /usr/local/bin/php-cgi
65534 57801 47240 0 4 0 182328 82408 sbwait I ?? 0:19.97 /usr/local/bin/php-cgi
65534 57809 47240 0 4 0 182328 84096 sbwait I ?? 1:12.03 /usr/local/bin/php-cgi
65534 57823 47240 0 4 0 182328 84492 sbwait I ?? 2:04.21 /usr/local/bin/php-cgi
65534 57833 47240 0 4 0 183352 83316 sbwait I ?? 0:28.62 /usr/local/bin/php-cgi
65534 57866 47240 0 4 0 182328 79952 sbwait I ?? 0:05.92 /usr/local/bin/php-cgi
65534 57870 47240 0 4 0 182328 83184 sbwait I ?? 0:56.83 /usr/local/bin/php-cgi
65534 57871 47240 0 4 0 182328 83388 sbwait I ?? 0:54.96 /usr/local/bin/php-cgi
65534 57891 47240 0 4 0 182328 84436 sbwait I ?? 1:58.32 /usr/local/bin/php-cgi
65534 57925 47240 0 4 0 182328 84380 sbwait I ?? 2:03.53 /usr/local/bin/php-cgi
65534 65944 47240 0 4 0 182328 84184 sbwait I ?? 0:39.97 /usr/local/bin/php-cgi
65534 65952 47240 0 4 0 182328 84408 sbwait I ?? 0:21.37 /usr/local/bin/php-cgi

From koitsu at FreeBSD.org Fri Nov 7 06:21:38 2008
From: koitsu at FreeBSD.org (Jeremy Chadwick)
Date: Fri Nov 7 06:21:47 2008
Subject: php-cgi frozen with sbwait when SMP enable
In-Reply-To:
References:
Message-ID: <20081107142137.GA7051@icarus.home.lan>

On Fri, Nov 07, 2008 at 07:29:37PM +0800, Ken Chen wrote:
> Hello,
>
> I have 4 web servers with lighttpd to serve one web site with DNS load
> sharing. On the 2 SMP-enable web servers, there will be many php-cgi frozen
> in 'sbwait' state every day. It means the php-cgi stay in 'sbwait' state,
> and never be back to 'accept' or other state. If I restart them, there will
> be frozen php-cgi appear some hours later.
>
> There is no problem on the other single CPU web servers which running same
> php scripts and same configuration and version of PHP.
>
> Why and any solution?

I'm not understanding what the problem is (and I've seen the output you
provided later in the thread). Are you stating the problem is that you
see many php-cgi processes? Or are you worried they're not doing
anything? Does the website function, lock up, or anything like that?
If not, what's the issue? :-)

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From ken73.chen at gmail.com Fri Nov 7 06:46:00 2008
From: ken73.chen at gmail.com (Ken Chen)
Date: Fri Nov 7 06:46:07 2008
Subject: php-cgi frozen with sbwait when SMP enable
In-Reply-To: <20081107142137.GA7051@icarus.home.lan>
References: <20081107142137.GA7051@icarus.home.lan>
Message-ID:

Hi Jeremy,

A health FastCGI process have a lifetime, so the PIDs of all php-cgi
processes should in a short range. There are some 'php-cgi' fall in
'sbwait' state, and stay there forever. The frozen 'php-cgi' can't accept
new request, so never retire.

Please forgive my poor English.

2008/11/7 Jeremy Chadwick

> On Fri, Nov 07, 2008 at 07:29:37PM +0800, Ken Chen wrote:
> > Hello,
> >
> > I have 4 web servers with lighttpd to serve one web site with DNS load
> > sharing. On the 2 SMP-enable web servers, there will be many php-cgi frozen
> > in 'sbwait' state every day. It means the php-cgi stay in 'sbwait' state,
> > and never be back to 'accept' or other state. If I restart them, there will
> > be frozen php-cgi appear some hours later.
> > > > There is no problem on the other single CPU web servers which running > same > > php scripts and same configuration and version of PHP. > > > > Why and any solution? > > I'm not understanding what the problem is (and I've seen the output you > provided later in the thread). Are you stating the problem is that you > see many php-cgi processes? Or are you worried they're not doing > anything? Does the website function, lock up, or anything like that? > If not, what's the issue? :-) > > -- > | Jeremy Chadwick jdc at parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. PGP: 4BD6C0CB | > > From ivoras at freebsd.org Fri Nov 7 06:51:00 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Fri Nov 7 06:51:06 2008 Subject: php-cgi frozen with sbwait when SMP enable In-Reply-To: References: Message-ID: Ken Chen wrote: > Oh.. sorry, I forgot to provide the information of my environment. > > web4# php-cgi -v > PHP 5.2.6 (cgi-fcgi) (built: Nov 2 2008 11:16:30) > Copyright (c) 1997-2008 The PHP Group > Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies > with XCache v1.2.2, Copyright (c) 2005-2007, by mOo > web4# /usr/local/lighttpd/sbin/lighttpd -v > lighttpd-1.4.19 - a light and fast webserver > Build-Date: Sep 1 2008 16:58:51 > web4# uname -a > FreeBSD web4.xxxx.com 7.0-RELEASE-p5 FreeBSD 7.0-RELEASE-p5 #11: Mon Nov 3 > 01:10:36 CST 2008 root@web4.xxxx.com:/usr/obj/usr/src/sys/WEB4 i386 > web4# ps alx | grep php-cgi | grep -v grep | grep sbwait > 65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 > /usr/local/bin/php-cgi > 65534 57801 47240 0 4 0 182328 82408 sbwait I ?? 0:19.97 > /usr/local/bin/php-cgi > 65534 57809 47240 0 4 0 182328 84096 sbwait I ?? 1:12.03 > /usr/local/bin/php-cgi > 65534 57823 47240 0 4 0 182328 84492 sbwait I ?? 2:04.21 > /usr/local/bin/php-cgi > 65534 57833 47240 0 4 0 183352 83316 sbwait I ?? 
0:28.62 /usr/local/bin/php-cgi
> 65534 57866 47240 0 4 0 182328 79952 sbwait I ?? 0:05.92 /usr/local/bin/php-cgi
> 65534 57870 47240 0 4 0 182328 83184 sbwait I ?? 0:56.83 /usr/local/bin/php-cgi
> 65534 57871 47240 0 4 0 182328 83388 sbwait I ?? 0:54.96 /usr/local/bin/php-cgi
> 65534 57891 47240 0 4 0 182328 84436 sbwait I ?? 1:58.32 /usr/local/bin/php-cgi
> 65534 57925 47240 0 4 0 182328 84380 sbwait I ?? 2:03.53 /usr/local/bin/php-cgi
> 65534 65944 47240 0 4 0 182328 84184 sbwait I ?? 0:39.97 /usr/local/bin/php-cgi
> 65534 65952 47240 0 4 0 182328 84408 sbwait I ?? 0:21.37 /usr/local/bin/php-cgi

This does seem a bit unusual, but seeing that your execution times are
not zero, it might be that the PHP servers are actually doing some useful
work. You should have a mixture of various states in PHP - do they show
up in top?

My own example is:

last pid: 77421; load averages: 2.82, 2.59, 2.13 up 55+16:58:49 15:48:16
209 processes: 2 running, 206 sleeping, 1 zombie
CPU: 49.8% user, 0.0% nice, 2.8% system, 0.0% interrupt, 47.4% idle
Mem: 1493M Active, 1583M Inact, 278M Wired, 139M Cache, 112M Buf, 505M Free
Swap: 4500M Total, 416M Used, 4084M Free, 9% Inuse

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
75863 www 1 4 0 162M 50020K sbwait 3 2:54 36.77% php-cgi
76830 www 1 103 0 156M 41556K CPU2 3 1:28 36.77% php-cgi
76834 www 1 4 0 163M 56628K sbwait 0 2:23 33.59% php-cgi
76019 www 1 4 0 150M 38948K accept 3 3:12 20.56% php-cgi
76825 www 1 4 0 158M 42912K accept 2 1:21 18.16% php-cgi
76846 www 1 4 0 162M 42600K sbwait 1 1:07 14.36% php-cgi
76835 www 1 4 0 151M 39948K accept 2 1:28 12.60% php-cgi
76829 www 1 4 0 150M 36564K sbwait 2 1:46 2.98% php-cgi

This is unusually high load, a spike, for this server but it has many
cores and it's stable. It's also running 7.1-PRERELEASE.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 252 bytes
Desc: OpenPGP digital signature
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081107/d52b02cb/signature.pgp

From ken73.chen at gmail.com Fri Nov 7 07:08:20 2008
From: ken73.chen at gmail.com (Ken Chen)
Date: Fri Nov 7 07:08:29 2008
Subject: php-cgi frozen with sbwait when SMP enable
In-Reply-To:
References:
Message-ID:

I captured something. Please check PID 57776: its CPU time has not
changed since my previous mail here.

web4# ps alx | grep php-cgi | grep -v grep | grep sbwait
65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 /usr/local/bin/php-cgi
65534 57801 47240 0 4 0 182328 82408 sbwait I ?? 0:19.97 /usr/local/bin/php-cgi
65534 57809 47240 0 4 0 182328 84096 sbwait I ?? 1:12.03 /usr/local/bin/php-cgi
65534 57823 47240 0 4 0 182328 84492 sbwait I ?? 2:04.21 /usr/local/bin/php-cgi
65534 57833 47240 0 4 0 183352 83316 sbwait I ?? 0:28.62 /usr/local/bin/php-cgi
65534 57866 47240 0 4 0 182328 79952 sbwait I ?? 0:05.92 /usr/local/bin/php-cgi
65534 57870 47240 0 4 0 182328 83184 sbwait I ?? 0:56.83 /usr/local/bin/php-cgi
65534 57871 47240 0 4 0 182328 83388 sbwait I ?? 0:54.96 /usr/local/bin/php-cgi
65534 57891 47240 0 4 0 182328 84436 sbwait I ?? 1:58.32 /usr/local/bin/php-cgi
65534 57925 47240 0 4 0 182328 84380 sbwait I ?? 2:03.53 /usr/local/bin/php-cgi
65534 65944 47240 0 4 0 182328 84184 sbwait I ?? 0:39.97 /usr/local/bin/php-cgi
65534 65952 47240 0 4 0 182328 84408 sbwait I ?? 0:21.37 /usr/local/bin/php-cgi
65534 66007 47240 0 4 0 183352 90960 sbwait I ?? 1:16.81 /usr/local/bin/php-cgi
65534 66014 47240 5 4 0 182328 92748 sbwait S ?? 1:41.23 /usr/local/bin/php-cgi
65534 66038 47240 1 4 0 182328 91900 sbwait I ?? 1:38.04 /usr/local/bin/php-cgi
65534 66060 47240 0 4 0 182328 90048 sbwait I ?? 1:15.46 /usr/local/bin/php-cgi
65534 66078 47240 3 4 0 182328 92224 sbwait S ?? 1:39.66 /usr/local/bin/php-cgi

web4# top -b
last pid: 70768; load averages: 1.62, 1.65, 1.43 up 4+15:56:06 22:53:48
85 processes: 1 running, 84 sleeping
Mem: 492M Active, 1204M Inact, 218M Wired, 60M Cache, 112M Buf, 27M Free
Swap: 2019M Total, 20K Used, 2019M Free

PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
69544 nobody 1 8 0 203M 38500K nanslp 1 6:31 11.33% php
47290 nobody 1 4 0 101M 98M kqread 1 30:42 2.98% lighttpd
66526 nobody 1 4 0 178M 92796K accept 1 1:40 1.12% php-cgi
66077 nobody 1 4 0 178M 92512K accept 0 1:49 1.07% php-cgi
65921 nobody 1 4 0 178M 92696K accept 0 1:43 0.98% php-cgi
65968 nobody 1 4 0 178M 92484K accept 0 1:43 0.93% php-cgi
66017 nobody 1 4 0 178M 92444K accept 0 1:50 0.88% php-cgi
65979 nobody 1 4 0 178M 92676K accept 1 1:44 0.88% php-cgi
66424 nobody 1 4 0 178M 92928K accept 1 1:36 0.88% php-cgi
65938 nobody 1 4 0 178M 92336K accept 1 1:52 0.73% php-cgi
65951 nobody 1 4 0 178M 92704K accept 0 1:48 0.73% php-cgi
66016 nobody 1 4 0 178M 92232K accept 1 1:41 0.73% php-cgi
65950 nobody 1 4 0 178M 93192K accept 0 1:51 0.68% php-cgi
65999 nobody 1 4 0 178M 92940K accept 1 1:46 0.63% php-cgi
66008 nobody 1 4 0 178M 93000K accept 1 1:46 0.63% php-cgi
69286 nobody 1 4 0 178M 92208K accept 1 0:37 0.63% php-cgi
47289 nobody 1 4 0 73400K 70640K kqread 1 12:02 0.59% lighttpd
65980 nobody 1 4 0 178M 93156K accept 1 1:51 0.59% php-cgi

2008/11/7 Ivan Voras
> Ken Chen wrote:
> > Oh.. sorry, I forgot to provide the information of my environment.
> > > > web4# php-cgi -v > > PHP 5.2.6 (cgi-fcgi) (built: Nov 2 2008 11:16:30) > > Copyright (c) 1997-2008 The PHP Group > > Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies > > with XCache v1.2.2, Copyright (c) 2005-2007, by mOo > > web4# /usr/local/lighttpd/sbin/lighttpd -v > > lighttpd-1.4.19 - a light and fast webserver > > Build-Date: Sep 1 2008 16:58:51 > > web4# uname -a > > FreeBSD web4.xxxx.com 7.0-RELEASE-p5 FreeBSD 7.0-RELEASE-p5 #11: Mon Nov > 3 > > 01:10:36 CST 2008 root@web4.xxxx.com:/usr/obj/usr/src/sys/WEB4 i386 > > web4# ps alx | grep php-cgi | grep -v grep | grep sbwait > > 65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 > > /usr/local/bin/php-cgi > > 65534 57801 47240 0 4 0 182328 82408 sbwait I ?? 0:19.97 > > /usr/local/bin/php-cgi > > 65534 57809 47240 0 4 0 182328 84096 sbwait I ?? 1:12.03 > > /usr/local/bin/php-cgi > > 65534 57823 47240 0 4 0 182328 84492 sbwait I ?? 2:04.21 > > /usr/local/bin/php-cgi > > 65534 57833 47240 0 4 0 183352 83316 sbwait I ?? 0:28.62 > > /usr/local/bin/php-cgi > > 65534 57866 47240 0 4 0 182328 79952 sbwait I ?? 0:05.92 > > /usr/local/bin/php-cgi > > 65534 57870 47240 0 4 0 182328 83184 sbwait I ?? 0:56.83 > > /usr/local/bin/php-cgi > > 65534 57871 47240 0 4 0 182328 83388 sbwait I ?? 0:54.96 > > /usr/local/bin/php-cgi > > 65534 57891 47240 0 4 0 182328 84436 sbwait I ?? 1:58.32 > > /usr/local/bin/php-cgi > > 65534 57925 47240 0 4 0 182328 84380 sbwait I ?? 2:03.53 > > /usr/local/bin/php-cgi > > 65534 65944 47240 0 4 0 182328 84184 sbwait I ?? 0:39.97 > > /usr/local/bin/php-cgi > > 65534 65952 47240 0 4 0 182328 84408 sbwait I ?? 0:21.37 > > /usr/local/bin/php-cgi > > This does seem a bit unusual, but seeing that your execution times are > not null it might that the PHP servers are actually doing some useful > work. You should have a mixture of various states in PHP - do they show > up in top? 
> > My own example is: > > last pid: 77421; load averages: 2.82, 2.59, 2.13 > up > 55+16:58:49 15:48:16 > 209 processes: 2 running, 206 sleeping, 1 zombie > CPU: 49.8% user, 0.0% nice, 2.8% system, 0.0% interrupt, 47.4% idle > Mem: 1493M Active, 1583M Inact, 278M Wired, 139M Cache, 112M Buf, 505M Free > Swap: 4500M Total, 416M Used, 4084M Free, 9% Inuse > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > 75863 www 1 4 0 162M 50020K sbwait 3 2:54 36.77% php-cgi > 76830 www 1 103 0 156M 41556K CPU2 3 1:28 36.77% php-cgi > 76834 www 1 4 0 163M 56628K sbwait 0 2:23 33.59% php-cgi > 76019 www 1 4 0 150M 38948K accept 3 3:12 20.56% php-cgi > 76825 www 1 4 0 158M 42912K accept 2 1:21 18.16% php-cgi > 76846 www 1 4 0 162M 42600K sbwait 1 1:07 14.36% php-cgi > 76835 www 1 4 0 151M 39948K accept 2 1:28 12.60% php-cgi > 76829 www 1 4 0 150M 36564K sbwait 2 1:46 2.98% php-cgi > > This is unusually high load, a spike, for this server but it has many > cores and it's stable. It's also running 7.1-PRERELEASE. > > From gavin at FreeBSD.org Fri Nov 7 07:42:17 2008 From: gavin at FreeBSD.org (Gavin Atkinson) Date: Fri Nov 7 07:42:30 2008 Subject: firewire disk disconnected but da* remains? In-Reply-To: <49142861.6020304@icyb.net.ua> References: <49142861.6020304@icyb.net.ua> Message-ID: <1226072532.69416.4.camel@buffy.york.ac.uk> On Fri, 2008-11-07 at 13:37 +0200, Andriy Gapon wrote: > I am a firewire newbie, so forgive me the following question. > I disconnect external firewire HDD, firewire subsystem notices this but > da0 device entry persists. Is this correct/expected behavior? Yes. From sbp(4): Some users familiar with umass(4) might wonder why the device is not detached at the CAM layer when the device is unplugged. It is detached only if the device has not been plugged again during several bus resets. 
This is for preventing to detach an active file system even when the
device cannot be probed correctly for some reason after a bus reset or
when the device is temporary disconnected because the user changes the
bus topology. If you want to force to detach the device, run ``fwcontrol
-r'' several times or set hw.firewire.hold_count=0 by sysctl(1).

Gavin

From jhs at berklix.org Fri Nov 7 08:00:09 2008
From: jhs at berklix.org (Julian Stacey)
Date: Fri Nov 7 08:00:23 2008
Subject: Western Digital hard disks and ATA timeouts
In-Reply-To: Your message "Fri, 07 Nov 2008 00:36:26 PST." <20081107083626.GA1583@icarus.home.lan>
Message-ID: <200811071541.mA7FfKEF021236@fire.js.berklix.net>

> But regardless of TLER being toggleable, FreeBSD's ATA command timeout
> of 5 seconds is too aggressive, and should be increased. Likewise, the
> value should be a sysctl, so those who do want such aggressive values

Once it migrates from a constant to a sysctl variable, could the kernel
maybe also sniff the drives and automatically set an appropriate value?
(Just an idea? :-)

Cheers,
Julian
-- 
Julian Stacey: BSDUnixLinux C Prog Admin SysEng Consult Munich www.berklix.com
Mail plain ASCII text. HTML & Base64 text are spam. www.asciiribbon.org

From kensmith at cse.Buffalo.EDU Fri Nov 7 09:17:21 2008
From: kensmith at cse.Buffalo.EDU (Ken Smith)
Date: Fri Nov 7 09:17:28 2008
Subject: 6.4-RC2 crashes after a few minutes of uptime
In-Reply-To: <9592E887-75F3-473F-9581-F9C22A9936A6@TrueStep.com>
References: <9592E887-75F3-473F-9581-F9C22A9936A6@TrueStep.com>
Message-ID: <1226078239.37011.37.camel@bauer.cse.buffalo.edu>

On Fri, 2008-11-07 at 00:00 -0500, Rory Arms wrote:
> Well, if I can assist with further debugging, let me know.

The person who followed up with a list of things that *may* have made
the problem go away mentioned that one of the things was disabling powerd.
Do you have that enabled, and if yes would you mind disabling it to see
if that's the culprit?

Thanks for the report.
-- Ken Smith - From there to here, from here to | kensmith@cse.buffalo.edu there, funny things are everywhere. | - Theodore Geisel | -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081107/e8f69105/attachment.pgp From avg at icyb.net.ua Fri Nov 7 09:24:15 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Fri Nov 7 09:24:28 2008 Subject: firewire disk disconnected but da* remains? In-Reply-To: <1226072532.69416.4.camel@buffy.york.ac.uk> References: <49142861.6020304@icyb.net.ua> <1226072532.69416.4.camel@buffy.york.ac.uk> Message-ID: <491479BB.10404@icyb.net.ua> on 07/11/2008 17:42 Gavin Atkinson said the following: > On Fri, 2008-11-07 at 13:37 +0200, Andriy Gapon wrote: >> I am a firewire newbie, so forgive me the following question. >> I disconnect external firewire HDD, firewire subsystem notices this but >> da0 device entry persists. Is this correct/expected behavior? > > Yes. From sbp(4): > > Some users familiar with umass(4) might wonder why the device is not > detached at the CAM layer when the device is unplugged. It is detached > only if the device has not been plugged again during several bus resets. > This is for preventing to detach an active file system even when the > device cannot be probed correctly for some reason after a bus reset or > when the device is temporary disconnected because the user changes the > bus topology. If you want to force to detach the device, run ``fwcontrol > -r'' several times or set hw.firewire.hold_count=0 by sysctl(1). Thanks a lot! I should have RTFM. 
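The two ways of forcing a detach that the sbp(4) excerpt quoted above mentions can be sketched as a small shell helper. This is only a sketch under stated assumptions: the function names, the DRY_RUN switch and the default of three bus resets are made up for illustration (the manual page only says "several times"); `fwcontrol -r` and `hw.firewire.hold_count` are taken from the quoted text.

```shell
#!/bin/sh
# Sketch: force sbp(4) to let go of a FireWire disk that was unplugged.
# DRY_RUN=1 (the default here, an assumption for safety) only prints
# the commands; set DRY_RUN=0 and run as root to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Option 1: initiate a few bus resets so the device is finally detached.
# $1 = number of resets (hypothetical default: 3).
force_detach_by_reset() {
    count="${1:-3}"
    i=0
    while [ "$i" -lt "$count" ]; do
        run fwcontrol -r
        i=$((i + 1))
    done
}

# Option 2: set the hold count to zero via sysctl(1), as the excerpt suggests.
disable_hold_count() {
    run sysctl hw.firewire.hold_count=0
}
```

With the default dry-run setting, `force_detach_by_reset 2` prints the two `fwcontrol -r` invocations it would make, which is a cheap way to check the loop before running it for real.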
-- Andriy Gapon From avg at icyb.net.ua Fri Nov 7 09:38:46 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Fri Nov 7 09:38:53 2008 Subject: /etc/ttys oddity Message-ID: <49147D23.3030503@icyb.net.ua> I have the following line in /etc/ttys: ttyv8 "/usr/local/bin/kdm -nodaemon" xterm on insecure Because of X misconfiguration it constantly crashed, so: kdm-bin[1178]: Unable to fire up local display :0; disabling. So I fix xorg.conf, then I change on => off in ttys, then I do kill -1 1, and X gets started! Seems illogical. Or maybe kdm-bin does something "smart" behind the scenes. -- Andriy Gapon From peter at wemm.org Fri Nov 7 11:12:57 2008 From: peter at wemm.org (Peter Wemm) Date: Fri Nov 7 11:13:05 2008 Subject: Western Digital hard disks and ATA timeouts In-Reply-To: <20081107071752.GA5842@icarus.home.lan> References: <20081107071752.GA5842@icarus.home.lan> Message-ID: On Thu, Nov 6, 2008 at 11:17 PM, Jeremy Chadwick wrote: [..] > As stated, FreeBSD's ATA command timeout is hard-set to 5 seconds, and > is not adjustable without editing the ATA code yourself and increasing > the value. The FreeNAS folks have made patches available to turn the > timeout value into a sysctl. > > Soren and/or others, please increase this timeout value. Five seconds > has now been deemed too aggressive a default. And please consider > migrating the timeout value into a sysctl. The 5 second timeout has been a problem for quite a while actually. I've had a number of instances where I've had to increase it to 20 or 30 seconds when recovering from marginal drives. The longest "successful" recovery attempt I've seen was 26 seconds, I believe on a Maxtor drive a few years ago. ("successful" == the drive spent 26 seconds but eventually successfully read the sector). Even the IBM death star drives could take much longer than 5 seconds to do a recovery 5 years ago. 5 seconds has never been a good default. I think the timeout should be increased to at least 30 seconds. 
My windows box has a timeout that goes for several minutes.

If there is concern about FreeBSD appearing to hang, I could imagine
that a console warning message could be printed after 5 seconds. But
just say "drive has not yet responded". But give it more time.

In this day and age we're generally not playing games with udma33 vs
66, notched cables, poor CRC support etc. SATA seems to have
eliminated all that. Hmm, it might make sense to increase the timeout
on SATA connections to 2 or 3 minutes by default.
-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell

From gabriele.cecchetti at gmail.com Fri Nov 7 12:26:32 2008
From: gabriele.cecchetti at gmail.com (Gabriele Cecchetti)
Date: Fri Nov 7 12:26:40 2008
Subject: Panics and freeze using age0
In-Reply-To: <20081107090125.GG11486@cdnetworks.co.kr>
References: <49133B35.9040603@sssup.it> <20081107072201.GD11486@cdnetworks.co.kr> <4913EF93.6010006@sssup.it> <20081107090125.GG11486@cdnetworks.co.kr>
Message-ID: <4914A46F.2000309@sssup.it>

Pyun YongHyeon wrote:
> Would you show me dmesg output?
>

This is the dmesg output; note that age0 is detected only after the
superuser logs in and starts ifconfig-ing.

Nov 5 12:12:35 granpasso kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Nov 5 12:12:35 granpasso kernel: The Regents of the University of California. All rights reserved.
Nov 5 12:12:35 granpasso kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Nov 5 12:12:35 granpasso kernel: FreeBSD 7.1-PRERELEASE #2: Wed Oct 1 17:30:35 CEST 2008 Nov 5 12:12:35 granpasso kernel: root@granpasso.retis:/usr/obj/usr/src/sys/GRANPASSOv3 Nov 5 12:12:35 granpasso kernel: Timecounter "i8254" frequency 1193182 Hz quality 0 Nov 5 12:12:35 granpasso kernel: CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz (2400.10-MHz K8-class CPU) Nov 5 12:12:35 granpasso kernel: Origin = "GenuineIntel" Id = 0x6fb Stepping = 11 Nov 5 12:12:35 granpasso kernel: Features=0xbfebfbff Nov 5 12:12:35 granpasso kernel: Features2=0xe3bd Nov 5 12:12:35 granpasso kernel: AMD Features=0x20100800 Nov 5 12:12:35 granpasso kernel: AMD Features2=0x1 Nov 5 12:12:35 granpasso kernel: Cores per package: 4 Nov 5 12:12:35 granpasso kernel: usable memory = 4285071360 (4086 MB) Nov 5 12:12:35 granpasso kernel: avail memory = 4131549184 (3940 MB) Nov 5 12:12:35 granpasso kernel: ACPI APIC Table: Nov 5 12:12:35 granpasso kernel: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs Nov 5 12:12:35 granpasso kernel: cpu0 (BSP): APIC ID: 0 Nov 5 12:12:35 granpasso kernel: cpu1 (AP): APIC ID: 1 Nov 5 12:12:35 granpasso kernel: cpu2 (AP): APIC ID: 2 Nov 5 12:12:35 granpasso kernel: cpu3 (AP): APIC ID: 3 Nov 5 12:12:35 granpasso kernel: ioapic0 irqs 0-23 on motherboard Nov 5 12:12:35 granpasso kernel: kbd1 at kbdmux0 Nov 5 12:12:35 granpasso kernel: netsmb_dev: loaded Nov 5 12:12:35 granpasso kernel: ichwd module loaded Nov 5 12:12:35 granpasso kernel: smbios0: at iomem 0xfc4f0-0xfc50e on motherboard Nov 5 12:12:35 granpasso kernel: smbios0: Version: 2.4, BCD Revision: 2.4 Nov 5 12:12:35 granpasso kernel: acpi0: on motherboard Nov 5 12:12:35 granpasso kernel: acpi0: [ITHREAD] Nov 5 12:12:35 granpasso kernel: acpi0: Power Button (fixed) Nov 5 12:12:35 granpasso kernel: acpi0: reservation of 0, a0000 (3) failed Nov 5 12:12:35 granpasso kernel: acpi0: reservation of e0000, 20000 (3) failed Nov 5 12:12:35 granpasso kernel: acpi0: reservation of 100000, dff00000 (3) failed Nov 5 
12:12:35 granpasso kernel: Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 Nov 5 12:12:35 granpasso kernel: acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 Nov 5 12:12:35 granpasso kernel: acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0 Nov 5 12:12:35 granpasso kernel: Timecounter "HPET" frequency 14318180 Hz quality 900 Nov 5 12:12:35 granpasso kernel: pcib0: port 0xcf8-0xcff on acpi0 Nov 5 12:12:35 granpasso kernel: pci0: on pcib0 Nov 5 12:12:35 granpasso kernel: pcib1: irq 16 at device 1.0 on pci0 Nov 5 12:12:35 granpasso kernel: pci1: on pcib1 Nov 5 12:12:35 granpasso kernel: 3ware device driver for 9000 series storage controllers, version: 3.70.05.001 Nov 5 12:12:35 granpasso kernel: twa0: <3ware 9000 series Storage Controller> port 0xc800-0xc8ff mem 0xf4000000-0xf5ffffff,0xf7dff000-0xf7dfffff irq 16 at device 0.0 on pci1 Nov 5 12:12:35 granpasso kernel: twa0: [ITHREAD] Nov 5 12:12:35 granpasso kernel: twa0: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-8LPML, 8 ports, Firmware FE9X 3.08.00.004, BIOS BE9X 3.08.00.002 Nov 5 12:12:35 granpasso kernel: uhci0: port 0xb800-0xb81f irq 16 at device 26.0 on pci0 Nov 5 12:12:35 granpasso kernel: uhci0: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: uhci0: [ITHREAD] Nov 5 12:12:35 granpasso kernel: usb0: on uhci0 Nov 5 12:12:35 granpasso kernel: usb0: USB revision 1.0 Nov 5 12:12:35 granpasso kernel: uhub0: on usb0 Nov 5 12:12:35 granpasso kernel: uhub0: 2 ports with 2 removable, self powered Nov 5 12:12:35 granpasso kernel: uhci1: port 0xb880-0xb89f irq 21 at device 26.1 on pci0 Nov 5 12:12:35 granpasso kernel: uhci1: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: uhci1: [ITHREAD] Nov 5 12:12:35 granpasso kernel: usb1: on uhci1 Nov 5 12:12:35 granpasso kernel: usb1: USB revision 1.0 Nov 5 12:12:35 granpasso kernel: uhub1: on usb1 Nov 5 12:12:35 granpasso kernel: uhub1: 2 ports with 2 removable, self powered Nov 5 12:12:35 granpasso kernel: uhci2: port 0xbc00-0xbc1f irq 18 at 
device 26.2 on pci0 Nov 5 12:12:35 granpasso kernel: uhci2: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: uhci2: [ITHREAD] Nov 5 12:12:35 granpasso kernel: usb2: on uhci2 Nov 5 12:12:35 granpasso kernel: usb2: USB revision 1.0 Nov 5 12:12:35 granpasso kernel: uhub2: on usb2 Nov 5 12:12:35 granpasso kernel: uhub2: 2 ports with 2 removable, self powered Nov 5 12:12:35 granpasso kernel: ehci0: mem 0xf7cffc00-0xf7cfffff irq 18 at device 26.7 on pci0 Nov 5 12:12:35 granpasso kernel: ehci0: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: ehci0: [ITHREAD] Nov 5 12:12:35 granpasso kernel: usb3: EHCI version 1.0 Nov 5 12:12:35 granpasso kernel: usb3: companion controllers, 2 ports each: usb0 usb1 usb2 Nov 5 12:12:35 granpasso kernel: usb3: on ehci0 Nov 5 12:12:35 granpasso kernel: usb3: USB revision 2.0 Nov 5 12:12:35 granpasso kernel: uhub3: on usb3 Nov 5 12:12:35 granpasso kernel: uhub3: 6 ports with 6 removable, self powered Nov 5 12:12:35 granpasso kernel: pci0: at device 27.0 (no driver attached) Nov 5 12:12:35 granpasso kernel: pcib2: irq 17 at device 28.0 on pci0 Nov 5 12:12:35 granpasso kernel: pci4: on pcib2 Nov 5 12:12:35 granpasso kernel: pcib3: irq 17 at device 28.4 on pci0 Nov 5 12:12:35 granpasso kernel: pci3: on pcib3 Nov 5 12:12:35 granpasso kernel: atapci0: mem 0xf7ffe000-0xf7ffffff irq 16 at device 0.0 on pci3 Nov 5 12:12:35 granpasso kernel: atapci0: [ITHREAD] Nov 5 12:12:35 granpasso kernel: atapci0: AHCI Version 01.00 controller with 2 ports detected Nov 5 12:12:35 granpasso kernel: ata2: on atapci0 Nov 5 12:12:35 granpasso kernel: ata2: [ITHREAD] Nov 5 12:12:35 granpasso kernel: ata3: on atapci0 Nov 5 12:12:35 granpasso kernel: ata3: [ITHREAD] Nov 5 12:12:35 granpasso kernel: atapci1: port 0xdc00-0xdc07,0xd880-0xd883,0xd800-0xd807,0xd480-0xd483,0xd400-0xd40f at device 0.1 on pci3 Nov 5 12:12:35 granpasso kernel: atapci1: [ITHREAD] Nov 5 12:12:35 granpasso kernel: ata4: on atapci1 Nov 5 12:12:35 granpasso kernel: ata4: [ITHREAD] Nov 5 12:12:35 
granpasso kernel: pcib4: irq 16 at device 28.5 on pci0 Nov 5 12:12:35 granpasso kernel: pci2: on pcib4 Nov 5 12:12:35 granpasso kernel: pci2: at device 0.0 (no driver attached) Nov 5 12:12:35 granpasso kernel: uhci3: port 0xb080-0xb09f irq 23 at device 29.0 on pci0 Nov 5 12:12:35 granpasso kernel: uhci3: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: uhci3: [ITHREAD] Nov 5 12:12:35 granpasso kernel: usb4: on uhci3 Nov 5 12:12:35 granpasso kernel: usb4: USB revision 1.0 Nov 5 12:12:35 granpasso kernel: uhub4: on usb4 Nov 5 12:12:35 granpasso kernel: uhub4: 2 ports with 2 removable, self powered Nov 5 12:12:35 granpasso kernel: uhci4: port 0xb400-0xb41f irq 19 at device 29.1 on pci0 Nov 5 12:12:35 granpasso kernel: uhci4: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: uhci4: [ITHREAD] Nov 5 12:12:35 granpasso kernel: usb5: on uhci4 Nov 5 12:12:35 granpasso kernel: usb5: USB revision 1.0 Nov 5 12:12:35 granpasso kernel: uhub5: on usb5 Nov 5 12:12:35 granpasso kernel: uhub5: 2 ports with 2 removable, self powered Nov 5 12:12:35 granpasso kernel: uhci5: port 0xb480-0xb49f irq 18 at device 29.2 on pci0 Nov 5 12:12:35 granpasso kernel: uhci5: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: uhci5: [ITHREAD] Nov 5 12:12:35 granpasso kernel: usb6: on uhci5 Nov 5 12:12:35 granpasso kernel: usb6: USB revision 1.0 Nov 5 12:12:35 granpasso kernel: uhub6: on usb6 Nov 5 12:12:35 granpasso kernel: uhub6: 2 ports with 2 removable, self powered Nov 5 12:12:35 granpasso kernel: ehci1: mem 0xf7cff800-0xf7cffbff irq 23 at device 29.7 on pci0 Nov 5 12:12:35 granpasso kernel: ehci1: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: ehci1: [ITHREAD] Nov 5 12:12:35 granpasso kernel: usb7: EHCI version 1.0 Nov 5 12:12:35 granpasso kernel: usb7: companion controllers, 2 ports each: usb4 usb5 usb6 Nov 5 12:12:35 granpasso kernel: usb7: on ehci1 Nov 5 12:12:35 granpasso kernel: usb7: USB revision 2.0 Nov 5 12:12:35 granpasso kernel: uhub7: on usb7 Nov 5 12:12:35 granpasso kernel: uhub7: 6 ports 
with 6 removable, self powered Nov 5 12:12:35 granpasso kernel: pcib5: at device 30.0 on pci0 Nov 5 12:12:35 granpasso kernel: pci5: on pcib5 Nov 5 12:12:35 granpasso kernel: em0: port 0xec00-0xec3f mem 0xfebe0000-0xfebfffff,0xfebc0000-0xfebdffff irq 17 at device 1.0 on pci5 Nov 5 12:12:35 granpasso kernel: em0: [FILTER] Nov 5 12:12:35 granpasso kernel: em0: Ethernet address: 00:0e:0c:c0:f9:fc Nov 5 12:12:35 granpasso kernel: vgapci0: mem 0xf8000000-0xfbffffff irq 18 at device 2.0 on pci5 Nov 5 12:12:35 granpasso kernel: fwohci0: port 0xe880-0xe8ff mem 0xfeb8f800-0xfeb8ffff irq 16 at device 3.0 on pci5 Nov 5 12:12:35 granpasso kernel: fwohci0: [FILTER] Nov 5 12:12:35 granpasso kernel: fwohci0: OHCI version 1.10 (ROM=1) Nov 5 12:12:35 granpasso kernel: fwohci0: No. of Isochronous channels is 4. Nov 5 12:12:35 granpasso kernel: fwohci0: EUI64 00:11:d8:00:01:8a:c3:a0 Nov 5 12:12:35 granpasso kernel: fwohci0: Phy 1394a available S400, 2 ports. Nov 5 12:12:35 granpasso kernel: fwohci0: Link S400, max_rec 2048 bytes. 
Nov 5 12:12:35 granpasso kernel: firewire0: on fwohci0 Nov 5 12:12:35 granpasso kernel: fwip0: on firewire0 Nov 5 12:12:35 granpasso kernel: fwip0: Firewire address: 00:11:d8:00:01:8a:c3:a0 @ 0xfffe00000000, S400, maxrec 2048 Nov 5 12:12:35 granpasso kernel: sbp0: on firewire0 Nov 5 12:12:35 granpasso kernel: fwohci0: Initiate bus reset Nov 5 12:12:35 granpasso kernel: fwohci0: BUS reset Nov 5 12:12:35 granpasso kernel: fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode Nov 5 12:12:35 granpasso kernel: isab0: at device 31.0 on pci0 Nov 5 12:12:35 granpasso kernel: isa0: on isab0 Nov 5 12:12:35 granpasso kernel: ichsmb0: port 0x400-0x41f mem 0xf7cff400-0xf7cff4ff irq 18 at device 31.3 on pci0 Nov 5 12:12:35 granpasso kernel: ichsmb0: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: ichsmb0: [ITHREAD] Nov 5 12:12:35 granpasso kernel: smbus0: on ichsmb0 Nov 5 12:12:35 granpasso kernel: smb0: on smbus0 Nov 5 12:12:35 granpasso kernel: acpi_button0: on acpi0 Nov 5 12:12:35 granpasso kernel: speaker0: port 0x61 on acpi0 Nov 5 12:12:35 granpasso kernel: sio0: configured irq 4 not in bitmap of probed irqs 0 Nov 5 12:12:35 granpasso kernel: sio0: port may not be enabled Nov 5 12:12:35 granpasso kernel: sio0: configured irq 4 not in bitmap of probed irqs 0 Nov 5 12:12:35 granpasso kernel: sio0: port may not be enabled Nov 5 12:12:35 granpasso kernel: sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 Nov 5 12:12:35 granpasso kernel: sio0: type 16550A Nov 5 12:12:35 granpasso kernel: sio0: [FILTER] Nov 5 12:12:35 granpasso kernel: atkbdc0: port 0x60,0x64 irq 1 on acpi0 Nov 5 12:12:35 granpasso kernel: atkbd0: irq 1 on atkbdc0 Nov 5 12:12:35 granpasso kernel: kbd0 at atkbd0 Nov 5 12:12:35 granpasso kernel: atkbd0: [GIANT-LOCKED] Nov 5 12:12:35 granpasso kernel: atkbd0: [ITHREAD] Nov 5 12:12:35 granpasso kernel: cpu0: on acpi0 Nov 5 12:12:35 granpasso kernel: ACPI Warning (tbutils-0243): Incorrect checksum in table [OEMB] - 61, should be 60 
[20070320] Nov 5 12:12:35 granpasso kernel: coretemp0: on cpu0 Nov 5 12:12:35 granpasso kernel: est0: on cpu0 Nov 5 12:12:35 granpasso kernel: p4tcc0: on cpu0 Nov 5 12:12:35 granpasso kernel: cpu1: on acpi0 Nov 5 12:12:35 granpasso kernel: coretemp1: on cpu1 Nov 5 12:12:35 granpasso kernel: est1: on cpu1 Nov 5 12:12:35 granpasso kernel: p4tcc1: on cpu1 Nov 5 12:12:35 granpasso kernel: cpu2: on acpi0 Nov 5 12:12:35 granpasso kernel: coretemp2: on cpu2 Nov 5 12:12:35 granpasso kernel: est2: on cpu2 Nov 5 12:12:35 granpasso kernel: p4tcc2: on cpu2 Nov 5 12:12:35 granpasso kernel: cpu3: on acpi0 Nov 5 12:12:35 granpasso kernel: coretemp3: on cpu3 Nov 5 12:12:35 granpasso kernel: est3: on cpu3 Nov 5 12:12:35 granpasso kernel: p4tcc3: on cpu3 Nov 5 12:12:35 granpasso kernel: ichwd0: on isa0 Nov 5 12:12:35 granpasso kernel: ichwd0: Intel ICH9 watchdog timer (ICH9 or equivalent) Nov 5 12:12:35 granpasso kernel: orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9000-0xcbfff,0xcc000-0xcdfff on isa0 Nov 5 12:12:35 granpasso kernel: sc0: at flags 0x100 on isa0 Nov 5 12:12:35 granpasso kernel: sc0: VGA <16 virtual consoles, flags=0x300> Nov 5 12:12:35 granpasso kernel: vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 Nov 5 12:12:35 granpasso kernel: sio1: configured irq 3 not in bitmap of probed irqs 0 Nov 5 12:12:35 granpasso kernel: sio1: port may not be enabled Nov 5 12:12:35 granpasso kernel: Timecounters tick every 1.000 msec Nov 5 12:12:35 granpasso kernel: ipfw2 initialized, divert enabled, nat loadable, rule-based forwarding disabled, default to deny, logging limited to 100 packets/entry by default Nov 5 12:12:35 granpasso kernel: firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me) Nov 5 12:12:35 granpasso kernel: firewire0: bus manager 0 (me) Nov 5 12:12:35 granpasso kernel: ad4: 715404MB at ata2-master SATA150 Nov 5 12:12:35 granpasso kernel: SMP: AP CPU #1 Launched! Nov 5 12:12:35 granpasso kernel: SMP: AP CPU #3 Launched! 
Nov 5 12:12:35 granpasso kernel: SMP: AP CPU #2 Launched! Nov 5 12:12:35 granpasso kernel: da0 at twa0 bus 0 target 0 lun 0 Nov 5 12:12:35 granpasso kernel: da0: Fixed Direct Access SCSI-5 device Nov 5 12:12:35 granpasso kernel: da0: 100.000MB/s transfers Nov 5 12:12:35 granpasso kernel: da0: 1716552MB (3515498496 512 byte sectors: 255H 63S/T 218829C) Nov 5 12:12:35 granpasso kernel: Trying to mount root from ufs:/dev/da0s1a /* - one minute later - */ Nov 5 12:13:33 granpasso login: ROOT LOGIN (root) ON ttyv0 /* - ten minutes later - */ Nov 5 12:23:47 granpasso kernel: age0: mem 0xf7ec0000-0xf7efffff irq 17 at device 0.0 on pci2 Nov 5 12:23:47 granpasso kernel: age0: PCI device revision : 0x00b0 Nov 5 12:23:47 granpasso kernel: age0: Chip id/revision : 0x9006 Nov 5 12:23:47 granpasso kernel: age0: 1280 Tx FIFO, 2364 Rx FIFO Nov 5 12:23:47 granpasso kernel: age0: MSIX count : 0 Nov 5 12:23:47 granpasso kernel: age0: MSI count : 1 Nov 5 12:23:47 granpasso kernel: age0: Using 1 MSI messages. Nov 5 12:23:47 granpasso kernel: age0: Read request size : 512 bytes. Nov 5 12:23:47 granpasso kernel: age0: TLP payload size : 128 bytes. Nov 5 12:23:47 granpasso kernel: age0: PCI VPD capability not found! Nov 5 12:23:47 granpasso kernel: miibus0: on age0 Nov 5 12:23:47 granpasso kernel: atphy0: PHY 0 on miibus0 Nov 5 12:23:47 granpasso kernel: atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto Nov 5 12:23:47 granpasso kernel: age0: Ethernet address: 00:1d:60:cb:07:c5 Nov 5 12:23:47 granpasso kernel: age0: [FILTER] Nov 5 12:23:47 granpasso kernel: age0: link state changed to DOWN Nov 5 12:23:48 granpasso kernel: age0: interrupt moderation is 100 us. Nov 5 12:23:48 granpasso kernel: age0: interrupt moderation is 100 us. Nov 5 12:23:51 granpasso kernel: age0: link state changed to UP Nov 5 12:24:00 granpasso kernel: age0: interrupt moderation is 100 us. 
Nov 5 12:24:00 granpasso kernel: age0: link state changed to DOWN
Nov 5 12:24:04 granpasso kernel: age0: link state changed to UP

From votdev at gmx.de Fri Nov 7 13:02:34 2008
From: votdev at gmx.de (Volker Theile)
Date: Fri Nov 7 13:02:42 2008
Subject: Western Digital hard disks and ATA timeouts
In-Reply-To:
References: <20081107071752.GA5842@icarus.home.lan>
Message-ID: <4914A6A1.9050909@gmx.de>

I can confirm that. Many FreeNAS users had problems with their HDDs
(e.g. with APM, when waking disks to access them after they had gone to
sleep). Increasing the timeouts solves the problem in most cases.
I think increasing the value BUT allowing the user to set it to a
preferred value via sysctl would be the best solution. I don't
understand why adding such a sysctl interface is such a problem for
some people. If someone wants to set any value other than the default
one, HE MUST KNOW what he is doing and live with the consequences. There
are so many other kernel/system variables that can harm the system.

Regards
Volker

Peter Wemm wrote:
> On Thu, Nov 6, 2008 at 11:17 PM, Jeremy Chadwick wrote:
> [..]
>
>> As stated, FreeBSD's ATA command timeout is hard-set to 5 seconds, and
>> is not adjustable without editing the ATA code yourself and increasing
>> the value. The FreeNAS folks have made patches available to turn the
>> timeout value into a sysctl.
>>
>> Soren and/or others, please increase this timeout value. Five seconds
>> has now been deemed too aggressive a default. And please consider
>> migrating the timeout value into a sysctl.
>>
>
> The 5 second timeout has been a problem for quite a while actually.
> I've had a number of instances where I've had to increase it to 20 or
> 30 seconds when recovering from marginal drives. The longest
> "successful" recovery attempt I've seen was 26 seconds, I believe on a
> Maxtor drive a few years ago. ("successful" == the drive spent 26
> seconds but eventually successfully read the sector).
Even the IBM
> death star drives could take much longer than 5 seconds to do a
> recovery 5 years ago. 5 seconds has never been a good default.
>
> I think the timeout should be increased to at least 30 seconds. My
> windows box has a timeout that goes for several minutes.
>
> If there is concern about FreeBSD appearing to hang, I could imagine
> that a console warning message could be printed after 5 seconds. But
> just say "drive has not yet responded". But give it more time.
>
> In this day and age we're generally not playing games with udma33 vs
> 66, notched cables, poor CRC support etc. SATA seems to have
> eliminated all that. Hmm, it might make sense to increase the timeout
> on SATA connections to 2 or 3 minutes by default.

From whizzter at gmail.com Fri Nov 7 13:10:16 2008
From: whizzter at gmail.com (Jonas Lund)
Date: Fri Nov 7 13:10:23 2008
Subject: Western Digital hard disks and ATA timeouts
In-Reply-To:
References: <20081107071752.GA5842@icarus.home.lan>
Message-ID: <436c7eda0811071249g33a81c75w85b971ad23a9847d@mail.gmail.com>

As I'm writing this I'm trying to rescue the contents of another computer's disk. Something about the seek heads, or something related to that, is physically half-broken, so the disk might need up to 10 retries just to read a sector; once read, however, it's usually no problem. I'm using myrescue (running on 6.2, so I don't know if it's included in the current ports, but if anyone wants to run it on FreeBSD I've done the "gruntwork" for porting), so all the timeouts are not a really big issue as it'll try to read that sector again later, but had I had the sysctl I would've been a tad happier right now.
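[Editor's sketch, not part of the original message.] The "try to read that sector again later" strategy described above can be illustrated roughly as follows. This is a minimal sketch of deferred retries in general, not myrescue's actual implementation; the names `rescue` and `FlakyDisk`, and the use of `OSError` to signal a failed read, are assumptions made up for the illustration.

```python
def rescue(read_block, n_blocks, max_passes=10):
    """Copy blocks in several passes, deferring failures to the next pass.

    read_block(idx) is assumed to return the block's bytes or raise
    OSError (for example when the drive times out on that block).
    Returns (recovered, still_bad): a dict of block index -> data and
    a list of block indices that never read successfully.
    """
    recovered = {}
    pending = list(range(n_blocks))
    for _ in range(max_passes):
        if not pending:
            break
        still_bad = []
        for idx in pending:
            try:
                recovered[idx] = read_block(idx)
            except OSError:
                # Do not hammer the bad spot now; come back next pass.
                still_bad.append(idx)
        pending = still_bad
    return recovered, pending


class FlakyDisk:
    """Hypothetical stand-in for a failing disk, used only for the demo.

    Block i fails fails[i] times before it finally reads correctly.
    """

    def __init__(self, fails):
        self.fails = dict(fails)

    def read(self, idx):
        if self.fails.get(idx, 0) > 0:
            self.fails[idx] -= 1
            raise OSError("read timeout on block %d" % idx)
        return b"block-%d" % idx


if __name__ == "__main__":
    disk = FlakyDisk({1: 2, 3: 50})
    data, bad = rescue(disk.read, n_blocks=5, max_passes=10)
    print(sorted(data), bad)
```

With a short per-command timeout each failed attempt costs little, which is why a tool working this way tolerates the 5 second default; a tunable timeout would mainly reduce the number of passes needed.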
As for the defaults being a small value, I personally think it's better to throw out some messages/errors early on before the disk reaches a catastrophic state (at least on 6.2 the kernel will put out a message for each retry without giving faults; maybe allow more retries before throwing an error?). By catastrophic state I'm referring to that oh-so-famous Google paper that said that once a disk has started showing errors it doesn't have long to live, and I do trust that conclusion, as I've been "warned" by these messages 2 times but ignored them until the disk went really bad.

The main thing I'm trying to get across is that early warnings and small problems are a helluva lot better than big disasters. Think of it like the oil meter on your car: it's not like you're gonna go out and drive 100s of km in the wilderness if you know that the car is in a bad state.

(Now if only SMART info was reliable!)

/ Jonas

2008/11/7 Peter Wemm :
> On Thu, Nov 6, 2008 at 11:17 PM, Jeremy Chadwick wrote:
> [..]
>> As stated, FreeBSD's ATA command timeout is hard-set to 5 seconds, and
>> is not adjustable without editing the ATA code yourself and increasing
>> the value. The FreeNAS folks have made patches available to turn the
>> timeout value into a sysctl.
>>
>> Soren and/or others, please increase this timeout value. Five seconds
>> has now been deemed too aggressive a default. And please consider
>> migrating the timeout value into a sysctl.
>
> The 5 second timeout has been a problem for quite a while actually.
> I've had a number of instances where I've had to increase it to 20 or
> 30 seconds when recovering from marginal drives. The longest
> "successful" recovery attempt I've seen was 26 seconds, I believe on a
> Maxtor drive a few years ago. ("successful" == the drive spent 26
> seconds but eventually successfully read the sector). Even the IBM
> death star drives could take much longer than 5 seconds to do a
> recovery 5 years ago. 5 seconds has never been a good default.
> > I think the timeout should be increased to at least 30 seconds. My > windows box has a timeout that goes for several minutes. > > If there is concern about FreeBSD appearing to hang, I could imagine > that a console warning message could be printed after 5 seconds. But > just say "drive has not yet responded". But give it more time. > > In this day and age we're generally not playing games with udma33 vs > 66, notched cables, poor CRC support etc. SATA seems to have > eliminated all that. Hmm, it might make sense to increase the timeout > on SATA connections to 2 or 3 minutes by default. > -- > Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV > "All of this is for nothing if we don't go to the stars" - JMS/B5 > "If Java had true garbage collection, most programs would delete > themselves upon execution." -- Robert Sewell > _______________________________________________ > freebsd-hardware@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hardware > To unsubscribe, send any mail to "freebsd-hardware-unsubscribe@freebsd.org" > From sos at FreeBSD.ORG Fri Nov 7 13:16:28 2008 From: sos at FreeBSD.ORG (=?ISO-8859-1?Q?S=F8ren_Schmidt?=) Date: Fri Nov 7 13:16:35 2008 Subject: Western Digital hard disks and ATA timeouts In-Reply-To: References: <20081107071752.GA5842@icarus.home.lan> Message-ID: <77C223A7-C5FC-45DE-BF1A-3BC7982FA582@FreeBSD.ORG> On 7Nov, 2008, at 20:12 , Peter Wemm wrote: > On Thu, Nov 6, 2008 at 11:17 PM, Jeremy Chadwick > wrote: > [..] >> As stated, FreeBSD's ATA command timeout is hard-set to 5 seconds, >> and >> is not adjustable without editing the ATA code yourself and >> increasing >> the value. The FreeNAS folks have made patches available to turn the >> timeout value into a sysctl. >> >> Soren and/or others, please increase this timeout value. Five >> seconds >> has now been deemed too aggressive a default. And please consider >> migrating the timeout value into a sysctl. 
> > The 5 second timeout has been a problem for quite a while actually. > I've had a number of instances where I've had to increase it to 20 or > 30 seconds when recovering from marginal drives. The longest > "successful" recovery attempt I've seen was 26 seconds, I believe on a > Maxtor drive a few years ago. ("successful" == the drive spent 26 > seconds but eventually successfully read the sector). Even the IBM > death star drives could take much longer than 5 seconds to do a > recovery 5 years ago. 5 seconds has never been a good default. > > I think the timeout should be increased to at least 30 seconds. My > windows box has a timeout that goes for several minutes. > > If there is concern about FreeBSD appearing to hang, I could imagine > that a console warning message could be printed after 5 seconds. But > just say "drive has not yet responded". But give it more time. > > In this day and age we're generally not playing games with udma33 vs > 66, notched cables, poor CRC support etc. SATA seems to have > eliminated all that. Hmm, it might make sense to increase the timeout > on SATA connections to 2 or 3 minutes by default. Actually I do have a patch around that logs the timeout on the console after the normal timeout (5secs), then just goes on to wait for double the timeout and log again etc etc, final timeout was IIRC 60 secs but could be anything. -S?ren From oberman at es.net Fri Nov 7 13:21:50 2008 From: oberman at es.net (Kevin Oberman) Date: Fri Nov 7 13:21:57 2008 Subject: Problem with USB drive errors in recent 7-Stable Message-ID: <20081107212148.1A47245010@ptavv.es.net> I recently started getting errors on a fairly new USB connected SATA drive. Aside from the errors, the system was locking up as any process attempting to access the drive would lock up in disk uninterruptible wait ("D" in ps). I could not shut down the system and had to power it off. (It's a laptop.) After a reboot, I tried to fsck it and that locked up, too. 
I was able to recover by telling fsck to not fix the truncated inode and fix everything else. Then I ran fsck again and it was successful in fixing the inode. This happened several times.

I then bought a new drive and got the identical behavior! It was not the drive. I rolled my kernel back to 9/13/08 and tried again. This time it just worked! No errors or lock up.

I suspect that there are two issues. One results in the lock-up when the disk had errors and the other caused the purported disk errors. The latter has been introduced since 9/13/08. The kernel that produced the errors was from 10/21. I also ran a kernel from 10/8 which did not cause me problems, but I'm not sure that I used the USB drive with this kernel.

I'll be building a 10/8 kernel later, after I have backed up some data from a failing drive (PATA, not USB, and SMART confirms that this disk is sick). I will try to track down exactly which change triggered this ugly behavior, but that will take a number of kernel builds, so it will take a while.

Has anyone else seen this? Any ideas on what changes might be the most likely cause? Could be USB, CAM, or something else, I guess.

--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 224 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081107/30423b94/attachment.pgp From koitsu at FreeBSD.org Fri Nov 7 14:01:05 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Fri Nov 7 14:01:12 2008 Subject: Problem with USB drive errors in recent 7-Stable In-Reply-To: <20081107212148.1A47245010@ptavv.es.net> References: <20081107212148.1A47245010@ptavv.es.net> Message-ID: <20081107220102.GA14260@icarus.home.lan> On Fri, Nov 07, 2008 at 01:21:48PM -0800, Kevin Oberman wrote: > I recently started getting errors on a fairly new USB connected SATA > drive. Aside from the errors, the system was locking up as any process > attempting to access the drive would lock up in disk uninterruptible > wait ("D" in ps). I could not shut down the system and had to power it > off. (It's a laptop.) After a reboot, I tried to fsck it and that locked > up, too. I was able to recover by telling fsck to not fix the truncated > inode and fix everything else. Then I ran fsck again and it was > successful in fixing the inode. This happened several times. > > I then bought a new drive and got the identical behavior! It was not the > drive. I rolled my kernel back to 9/13/08 and tried again. This time it just > worked! No errors or lock up. > > I suspect that there are two issues. One results in the lock-up when the > disk had errors and the other caused the purported disk errors. The > latter has been introduced since 9/13/08. The kernel that produced the > errors was from 10/21. I also ran a kernel from 10/8 which did not cause > me problems, but I'm not sure that I used the USB drive with this > kernel. > > I'll be building a 10/8 kernel later, after I have backed up some data > from a failing drive (PATA, not USB, and SMART confirms that the this > disk is sick). 
I will try to track down exactly which change triggered > this ugly behavior, but that will take a number of kernel builds, so it > will take a while. > > Has anyone else seen this? Any ideas on what changes might be the most > likely cause. Could be USB, CAM, or something else, I guess. Funny you should post this today -- I just spent the past few days dealing with this problem, specifically the kernel being "stuck" when writing to a umass/da device (in my case, USB flash drives). When I say "stuck", I mean the kernel was still responsive: Ctrl-T would report statuses in processes (the states shown were all different) but the processes essentially had "hung". Ctrl-Alt-Esc on the console dropped me to a db> prompt, so it's not as if the machine had frozen/locked up; it was as if some part surrounding the storage subsystem was spinning in a loop. IP traffic still worked as well, but of course anything that accessed disks would hang. Rebooting the box via Ctrl-Alt-Del wouldn't work, because it would get stuck waiting for a bunch of PIDs to end. I switched the box to CURRENT (for a lot of reasons), and one of those was to try out the new USB4BSD (called "USB2" -- not to be confused with the USB2.0 protocol) stack. That simply induced a random kernel panic. However, HPS is fairly certain he found the issue, and it's with bus_dma(9) interaction. Here's the thread: http://lists.freebsd.org/pipermail/freebsd-current/2008-November/thread.html#235 http://lists.freebsd.org/pipermail/freebsd-current/2008-November/000220.html I have not yet tried his patches (I just woke up), but I will in a short while. So far I have a lot more faith in USB4BSD than I do the old stack, simply because there's active work going on in it. 
(It's ironic that I encountered this issue while working on a document describing how to put FreeBSD i386, amd64, and MS-DOS on a USB flash drive, so one could install FreeBSD from it, or boot MS-DOS for BIOS upgrades) -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From rorya+freebsd.org at TrueStep.com Fri Nov 7 14:07:40 2008 From: rorya+freebsd.org at TrueStep.com (Rory Arms) Date: Fri Nov 7 14:08:28 2008 Subject: 6.4-RC2 crashes after a few minutes of uptime In-Reply-To: <1226078239.37011.37.camel@bauer.cse.buffalo.edu> References: <9592E887-75F3-473F-9581-F9C22A9936A6@TrueStep.com> <1226078239.37011.37.camel@bauer.cse.buffalo.edu> Message-ID: <510F0121-0E24-42C0-BC77-D61DF9FEA46C@TrueStep.com> On 2008-11-07, at 12:17 , Ken Smith wrote: > On Fri, 2008-11-07 at 00:00 -0500, Rory Arms wrote: >> Well, if I can assist with further debugging, let me know. > > The person who followed up with a list of things that *may* have made > the problem go away mentioned one of the things was disabling powerd. > Do you have that enable, and if yes would you mind disabling it to see > if that's the culprit? Hi Ken, No, it's not running powerd. Looking at the process list, it looks like GNOME has some power related processes, but I assume that's different. I don't think I'd ever even heard of powerd till now, and have now read the manual to learn about it. Well, I wonder if the panic I had yesterday a few minutes after booting RC2 was a fluke. The computer has been running so far without problems, using GNOME, on that second RC2 bootup, for almost 24 hours now. However, I still find it concerning that kgdb(1) on 6.4, hasn't been able to open any of the core dump though. If this is broken, it will be difficult to provide bug reports. 
I wonder if it's just something about this particular code path to the panic that's generating this, I guess, corrupt core file. Is there a key sequence, or sysctl knob perhaps, that I can use to force an artificial panic to see if the coredump generated is any different?

- rory

From vermaden at interia.pl Fri Nov 7 17:47:13 2008
From: vermaden at interia.pl (vermaden)
Date: Fri Nov 7 17:47:23 2008
Subject: sysctl debug.cpufreq.highest
Message-ID: <20081108012734.3274F1E3055@f03.poczta.interia.pl>

Hi,

Currently it is possible to set the lowest CPU speed for scaling with cpufreq (debug.cpufreq.lowest). It would be good to also include an option to set the highest possible frequency to use with cpufreq; some laptops get too hot and/or consume too much power when running at the maximum power/speed of the CPU.

Regards
vermaden

From barbara.xxx1975 at libero.it Fri Nov 7 17:58:06 2008
From: barbara.xxx1975 at libero.it (Barbara)
Date: Fri Nov 7 17:58:13 2008
Subject: R: Re: 6.4-RC2 crashes after a few minutes of uptime
Message-ID: <25487728.863151226109459073.JavaMail.root@wmail32>

>
>The person who followed up with a list of things that *may* have made
>the problem go away mentioned one of the things was disabling powerd.
>Do you have that enabled, and if yes would you mind disabling it to see
>if that's the culprit?
>
>Thanks for the report.

Hi, it's the person speaking ;)
It seems that I spoke too early. About an hour ago my box hung, but this time it didn't panic (it hasn't since ~Oct. 12). And as confirmed by Rory, it seems that powerd isn't responsible.
The only thing I was able to do was switch to ttyv0, but after I entered my login it didn't prompt for the password.
In the meantime, messages similar to the following were popping up:

acd0: WARNING - PREVENT_ALLOW taskqueue timeout - completing request directly
acd0: WARNING - PREVENT_ALLOW freeing taskqueue zombie request
acd0: WARNING - TEST_UNIT_READY taskqueue timeout - completing request directly
acd0: WARNING - TEST_UNIT_READY freeing taskqueue zombie request

Again, as I've reported in http://www.freebsd.org/cgi/query-pr.cgi?pr=128076 , I was not using acd0 and had not used it since the box was turned on.
The box was replying if pinged, but I was unable to access it via ssh, so I had to press the reset button.

What happened is similar to what is described here:
http://lists.freebsd.org/pipermail/freebsd-ports/2006-December/037796.html
And here http://www.freebsd.org/cgi/query-pr.cgi?pr=110015 I can see another swi6 panic with the same message in the kernel buffer (acd0: WARNING - PREVENT_ALLOW read data overrun 18>0) that I had in my PR.

Isn't my backtrace of any help in tracking down the problem?

From pyunyh at gmail.com Sat Nov 8 00:51:35 2008
From: pyunyh at gmail.com (Pyun YongHyeon)
Date: Sat Nov 8 00:51:43 2008
Subject: Panics and freeze using age0
In-Reply-To: <4914A46F.2000309@sssup.it>
References: <49133B35.9040603@sssup.it> <20081107072201.GD11486@cdnetworks.co.kr> <4913EF93.6010006@sssup.it> <20081107090125.GG11486@cdnetworks.co.kr> <4914A46F.2000309@sssup.it>
Message-ID: <20081108084850.GF14970@cdnetworks.co.kr>

On Fri, Nov 07, 2008 at 12:26:23PM -0800, Gabriele Cecchetti wrote:
> Pyun YongHyeon wrote:
> >Would you show me dmesg output?
> >
>
> This is the dmesg output; note that age0 is detected only after the superuser
> logs in and starts ifconfig-ing.
>
> Nov 5 12:12:35 granpasso kernel: Copyright (c) 1979, 1980, 1983, 1986,
> 1988, 1989, 1991, 1992, 1993, 1994
> Nov 5 12:12:35 granpasso kernel: The Regents of the University of
> California. All rights reserved.
> Nov 5 12:12:35 granpasso kernel: FreeBSD is a registered trademark of The
> FreeBSD Foundation.
> Nov 5 12:12:35 granpasso kernel: FreeBSD 7.1-PRERELEASE #2: Wed Oct 1 > 17:30:35 CEST 2008 > Nov 5 12:12:35 granpasso kernel: > root@granpasso.retis:/usr/obj/usr/src/sys/GRANPASSOv3 > Nov 5 12:12:35 granpasso kernel: Timecounter "i8254" frequency 1193182 Hz > quality 0 > Nov 5 12:12:35 granpasso kernel: CPU: Intel(R) Core(TM)2 Quad CPU Q6600 > @ 2.40GHz (2400.10-MHz K8-class CPU) > Nov 5 12:12:35 granpasso kernel: Origin = "GenuineIntel" Id = 0x6fb > Stepping = 11 > Nov 5 12:12:35 granpasso kernel: > Features=0xbfebfbff H,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> > Nov 5 12:12:35 granpasso kernel: > Features2=0xe3bd > Nov 5 12:12:35 granpasso kernel: AMD Features=0x20100800 > Nov 5 12:12:35 granpasso kernel: AMD Features2=0x1 > Nov 5 12:12:35 granpasso kernel: Cores per package: 4 > Nov 5 12:12:35 granpasso kernel: usable memory = 4285071360 (4086 MB) > Nov 5 12:12:35 granpasso kernel: avail memory = 4131549184 (3940 MB) > Nov 5 12:12:35 granpasso kernel: ACPI APIC Table: > Nov 5 12:12:35 granpasso kernel: FreeBSD/SMP: Multiprocessor System > Detected: 4 CPUs [...] > Nov 5 12:12:35 granpasso kernel: SMP: AP CPU #1 Launched! > Nov 5 12:12:35 granpasso kernel: SMP: AP CPU #3 Launched! > Nov 5 12:12:35 granpasso kernel: SMP: AP CPU #2 Launched! 
> Nov 5 12:12:35 granpasso kernel: da0 at twa0 bus 0 target 0 lun 0 > Nov 5 12:12:35 granpasso kernel: da0: Fixed > Direct Access SCSI-5 device Nov 5 12:12:35 granpasso kernel: da0: > 100.000MB/s transfers > Nov 5 12:12:35 granpasso kernel: da0: 1716552MB (3515498496 512 byte > sectors: 255H 63S/T 218829C) > Nov 5 12:12:35 granpasso kernel: Trying to mount root from ufs:/dev/da0s1a > /* - one minute later - */ > Nov 5 12:13:33 granpasso login: ROOT LOGIN (root) ON ttyv0 > /* - ten minutes later - */ > Nov 5 12:23:47 granpasso kernel: age0: Gigabit Ethernet> mem 0xf7ec0000-0xf7efffff irq 17 at device 0.0 on pci2 > Nov 5 12:23:47 granpasso kernel: age0: PCI device revision : 0x00b0 > Nov 5 12:23:47 granpasso kernel: age0: Chip id/revision : 0x9006 > Nov 5 12:23:47 granpasso kernel: age0: 1280 Tx FIFO, 2364 Rx FIFO > Nov 5 12:23:47 granpasso kernel: age0: MSIX count : 0 > Nov 5 12:23:47 granpasso kernel: age0: MSI count : 1 > Nov 5 12:23:47 granpasso kernel: age0: Using 1 MSI messages. > Nov 5 12:23:47 granpasso kernel: age0: Read request size : 512 bytes. > Nov 5 12:23:47 granpasso kernel: age0: TLP payload size : 128 bytes. > Nov 5 12:23:47 granpasso kernel: age0: PCI VPD capability not found! > Nov 5 12:23:47 granpasso kernel: miibus0: on age0 > Nov 5 12:23:47 granpasso kernel: atphy0: PHY > 0 on miibus0 > Nov 5 12:23:47 granpasso kernel: atphy0: 10baseT, 10baseT-FDX, 100baseTX, > 100baseTX-FDX, 1000baseT-FDX, auto > Nov 5 12:23:47 granpasso kernel: age0: Ethernet address: 00:1d:60:cb:07:c5 > Nov 5 12:23:47 granpasso kernel: age0: [FILTER] > Nov 5 12:23:47 granpasso kernel: age0: link state changed to DOWN > Nov 5 12:23:48 granpasso kernel: age0: interrupt moderation is 100 us. > Nov 5 12:23:48 granpasso kernel: age0: interrupt moderation is 100 us. > Nov 5 12:23:51 granpasso kernel: age0: link state changed to UP > Nov 5 12:24:00 granpasso kernel: age0: interrupt moderation is 100 us. 
> Nov 5 12:24:00 granpasso kernel: age0: link state changed to DOWN > Nov 5 12:24:04 granpasso kernel: age0: link state changed to UP I don't see age(4) attach failure message in your output. Does your kernel configuration file have "device age" entry? When you don't have "device age" entry in kernel configuration and if you run "ifconfig age0", ifconfig(8) will try to load age(4) kernel module. -- Regards, Pyun YongHyeon From volker at vwsoft.com Sat Nov 8 04:32:07 2008 From: volker at vwsoft.com (Volker) Date: Sat Nov 8 04:32:14 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: <4912E462.4090608@icyb.net.ua> References: <4912E462.4090608@icyb.net.ua> Message-ID: <491586B9.2020303@vwsoft.com> Andriy, On 12/23/-58 20:59, Andriy Gapon wrote: > I have a quite strange problem. > This is with 7-BETA amd64. Did it work with earlier versions? > All of USB is out of kernel and is loaded via modules. > BIOS has "Legacy USB" enabled. > I have only a USB keyboard, no PS/2 port. Can you check BIOS settings for EHCI handover? If the BIOS does not have handover enabled, it may disable legacy support after a timeout, which is often bad. IMO this is the same with booting off USB drives but every BIOS handles that different. > The keyboard works file in BIOS and for selecting boot device in boot0 > menu. It also works in loader menu. If in the menu I select to go to > loader prompt then it works for about 5 seconds and then "dies" - no > reaction to key presses, no led change, nothing. > I haven't actually verified if the keyboard would still work if I stayed > in loader menu for longer than ~10 seconds. > > This doesn't happen if USB is built into kernel. That sound strange. I have no idea why that might work (or I'm totally wrong with my handover theory). > Weird... 
Yes, sounds like it, or it's probably easily explainable ;)

Volker

From fbsd-stable-0 at ml.turing-complete.org Sat Nov 8 05:55:38 2008
From: fbsd-stable-0 at ml.turing-complete.org (Nicolas Rachinsky)
Date: Sat Nov 8 05:55:45 2008
Subject: Block device
In-Reply-To: <49107933.7070907@samsco.org>
References: <004901c93e8a$1b556500$639049d9@EC1a> <20081104145144.GB14539@hugo10.ka.punkt.de> <49107933.7070907@samsco.org>
Message-ID: <20081108133756.GA26413@mid.pc5.i.0x5.de>

* Scott Long [2008-11-04 09:32 -0700]:
> 1. disk access in the driver layer still happens on a block basis. It's
> true that to the application layer, the device has character dev
> semantics, meaning that arbitrary numbers of bytes can be accessed
> randomly without any restrictions. But deep down inside the kernel,
> it's still doing block-by-block access.

Isn't it the other way round? It has character device semantics, thus you cannot do reads and writes of arbitrary size on arbitrary positions?

# dd if=/dev/ad4 bs=1 count=1
dd: /dev/ad4: Invalid argument
root@pc5 ~# dd if=/dev/ad4 bs=512 count=1 >/dev/null
1+0 records in
1+0 records out
512 bytes transferred in 0.000173 secs (2957966 bytes/sec)

Nicolas

--
http://www.rachinsky.de/nicolas

From thomas at gibfest.dk Sat Nov 8 07:16:20 2008
From: thomas at gibfest.dk (Thomas Rasmussen)
Date: Sat Nov 8 07:16:27 2008
Subject: Problem with USB drive errors in recent 7-Stable
In-Reply-To: <20081107220102.GA14260@icarus.home.lan>
References: <20081107212148.1A47245010@ptavv.es.net> <20081107220102.GA14260@icarus.home.lan>
Message-ID: <4915A5B3.2010106@gibfest.dk>

Jeremy Chadwick wrote:
> (It's ironic that I encountered this issue while working on a document
> describing how to put FreeBSD i386, amd64, and MS-DOS on a USB flash
> drive, so one could install FreeBSD from it, or boot MS-DOS for BIOS
> upgrades)
>
>

Hello,

A bit offtopic but may I just say: That sounds great! I've been looking for a document like that.
The PCBSD guys have a .img on their site you can just dd to the flash drive, but I haven't been able to get something similar working with plain FreeBSD. It would be very neat to be able to have a couple of USB sticks in the laptop bag with the various "current" installations. Good stuff! Best regards Thomas Rasmussen From ivoras at freebsd.org Sat Nov 8 10:21:11 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Sat Nov 8 10:21:19 2008 Subject: php-cgi frozen with sbwait when SMP enable In-Reply-To: References: Message-ID: Ken Chen wrote: > I capture something. > > Please check the PID 57776. It's CPU time never change since my previous > mail here. > > web4# ps alx | grep php-cgi | grep -v grep | grep sbwait > 65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 > /usr/local/bin/php-cgi You're right and it is strange. I don't know why this would happen but some things that come to mind are: * Does lighttpd have some kind of status page or a diagnostic utility to show you the states of FastCGI processes? * Does lighttpd have a facility to "reap" old PHP processes? For example, mod_fcgid has a maximum lifetime setting for FastCGI processes. * You could try sending a SIGABRT to the php-cgi process to get a core dump and inspect it. Without debugging symbols it will probably give you the name of the function it's been waiting in. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 258 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081108/6675ffa2/signature.pgp From spawk at acm.poly.edu Sat Nov 8 10:43:57 2008 From: spawk at acm.poly.edu (Boris Kochergin) Date: Sat Nov 8 10:44:04 2008 Subject: sysctl debug.cpufreq.highest In-Reply-To: <20081108012734.3274F1E3055@f03.poczta.interia.pl> References: <20081108012734.3274F1E3055@f03.poczta.interia.pl> Message-ID: <4915D74B.2070502@acm.poly.edu> I've rolled a patchset to do this for 7.0-RELEASE (http://acm.poly.edu/~spawk/cpufreq/) if anyone's interested. -Boris vermaden wrote: > Hi, > > Currently there is possibility to set lowest speed of > cpu for scaling with cpufreq (debug.cpufreq.lowest), > it would be good to include also a option to set the > highest possible freq to use with cpufreq, some laptops > get too hot and/or consume too much power when running > on maximum power/speed of cpu. > > Regards > vermaden > > > ---------------------------------------------------------------------- > Dzwon taniej na zagraniczne komorki! > Sprawdz >> http://link.interia.pl/f1f6a > > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" > From traveling08 at cox.net Sat Nov 8 11:55:21 2008 From: traveling08 at cox.net (Robert) Date: Sat Nov 8 11:55:27 2008 Subject: AbiWord & CUPS Message-ID: <20081108113907.5a2affee@asus64> Greetings I have evidently done something wrong during installation of CUPS or AbiWord. I am unable to print a document from AbiWord. I can print from the command line using lpr. I can print from Evince, Firefox, Claws-Mail and other programs. When attempting to print from AbiWord either to the designated printer or to Generic Postscript (lpr) all seems well. 
The printer starts, feeds the paper but nothing is printed. If the document is one page in length, I get one blank page. If the document is 4 pages in length then I get four blank pages.

All ports are up to date and I am running 7 stable from yesterday. I have reinstalled cups and AbiWord to no avail.

Any help would be greatly appreciated.

Robert

Oh, I am using XFCE4

From mdh_lists at yahoo.com Sat Nov 8 09:36:58 2008
From: mdh_lists at yahoo.com (mdh)
Date: Sat Nov 8 12:48:57 2008
Subject: host(1) problem with -6 option
Message-ID: <94894.60594.qm@web56805.mail.re3.yahoo.com>

Howdy folks,

I'm having a little trouble understanding a problem that the `host` command in RELENG_7_0 (very recent) is having. This is by and large my first time working with IPv6, which I've been meaning to learn for some time.

First off, I've got my zone file configured to return a AAAA record for x1.mydomain and named isn't complaining. However, when I run `host -6 x1.mydomain`, host returns the following output:

(root@rapier) [/etc/namedb]: host -6 x1.mydomain
/usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:1179: internal_send: ::ffff:127.0.0.1#53: Invalid argument
/usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:1179: internal_send: ::ffff:IP.IP.IP.8#53: Invalid argument
/usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:1179: internal_send: ::ffff:127.0.0.1#53: Invalid argument
/usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:1179: internal_send: ::ffff:IP.IP.IP.8#53: Invalid argument
;; connection timed out; no servers could be reached

IP.IP.IP.8 is my ISP's DNS server, and is a third option just in case the localhost DNS server crashes or goes batty while I'm out drinking or somesuch.

Here's my resolv.conf, which shows ::1 listed as the second nameserver entry - however, it seems host -6 never even tries it.
domain mydomain
search mydomain
nameserver 127.0.0.1
nameserver ::1
nameserver IP.IP.IP.8

The DNS server running on localhost is authoritative for mydomain. I can ping it via localhost using both v4 and v6, and I can also ping the external v4 and v6 addresses just fine remotely.

Worth noting is that host without the -6 option resolves the v6 addresses just fine, however it seems like it should work properly with the -6 option as well. It is likely doing so via the IPv4 nameserver address, since that is the first nameserver specified in resolv.conf.

This may be a bug deserving a PR, but I'm not entirely sure, so I thought to check here first. As I said, I'm new to IPv6, but this behavior seems to be counterintuitive. Am I just doing it wrong?

Note: I'm not on -stable, so please CC: me on responses.

Thanks,
Matt

From markir at paradise.net.nz Sat Nov 8 13:30:22 2008
From: markir at paradise.net.nz (Mark Kirkwood)
Date: Sat Nov 8 13:30:32 2008
Subject: anoncvs1.FreeBSD.org broken?
Message-ID: <49160168.50009@paradise.net.nz>

I trying to update /usr/src via cvs from anoncvs1.FreeBSD.org and getting:

$ cd /usr/src
$ cat CVS/Root
:ext:anoncvs@anoncvs1.FreeBSD.org:/home/ncvs
$ cvs update -d -P
cannot close CVS/Entries No space left on device

From markir at paradise.net.nz Sat Nov 8 13:31:53 2008
From: markir at paradise.net.nz (Mark Kirkwood)
Date: Sat Nov 8 13:32:00 2008
Subject: anoncvs1.FreeBSD.org broken?
In-Reply-To: <49160168.50009@paradise.net.nz>
References: <49160168.50009@paradise.net.nz>
Message-ID: <491601C4.8050802@paradise.net.nz>

Ah - too quick with the "send" button - sorry about the poor English.

I wrote:
> I trying to update /usr/src ...
From lehmann at ans-netz.de Sun Nov 9 00:20:54 2008 From: lehmann at ans-netz.de (Oliver Lehmann) Date: Sun Nov 9 00:21:01 2008 Subject: libzfs: vmem.h missing Message-ID: <20081109092051.99e8d97c.lehmann@ans-netz.de> Hi, when I try to compile an actual 7 STABLE Im getting: ===> cddl/lib/libzfs (depend) rm -f .depend mkdep -f .depend -a -DZFS_NO_ACL -I/usr/src/cddl/lib/libzfs/../../../sbin/mount -I/usr/src/cddl/lib/libzfs/../../../cddl/lib/libumem -I/usr/src/cddl/lib/libzfs/../../../sys/cddl/compat/opensolaris -I/usr/src/cddl/lib/libzfs/../../../cddl/compat/opensolaris/include -I/usr/src/cddl/lib/libzfs/../../../cddl/compat/opensolaris/lib/libumem -I/usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzpool/common -I/usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/common/zfs -I/usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs -I/usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/uts/common/sys -I/usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/head -I/usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/uts/common -I/usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libnvpair -I/usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libuutil/common -I/usr/src/cddl/lib/libzfs/../.. 
/../cddl/contrib/opensolaris/lib/libzfs/common -D_SOLARIS_C_SOURCE /usr/src/cddl/lib/libzfs/../../../cddl/compat/opensolaris/misc/deviceid.c /usr/src/cddl/lib/libzfs/../../../cddl/compat/opensolaris/misc/mnttab.c /usr/src/cddl/lib/libzfs/../../../cddl/compat/opensolaris/misc/mkdirp.c /usr/src/cddl/lib/libzfs/../../../cddl/compat/opensolaris/misc/zmount.c /usr/src/cddl/lib/libzfs/../../../cddl/compat/opensolaris/misc/fsshare.c /usr/src/cddl/lib/libzfs/../../../cddl/compat/opensolaris/misc/zone.c /usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c /usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_graph.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_mount.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_changelist.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_config.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c In file included from /usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h:32, from /usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/common/zfs/zfs_prop.c:50: /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:64:18: error: vmem.h: No such file or directory In file included from /usr/src/cddl/lib/libzfs/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h:32, from
/usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c:44: /usr/src/cddl/lib/libzfs/../../../cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:64:18: error: vmem.h: No such file or directory ..... why is vmem.h missing? -- Oliver Lehmann http://www.pofo.de/ http://wishlist.ans-netz.de/ From m.seaman at infracaninophile.co.uk Sun Nov 9 03:00:33 2008 From: m.seaman at infracaninophile.co.uk (Matthew Seaman) Date: Sun Nov 9 03:00:40 2008 Subject: host(1) problem with -6 option In-Reply-To: <94894.60594.qm@web56805.mail.re3.yahoo.com> References: <94894.60594.qm@web56805.mail.re3.yahoo.com> Message-ID: <4916C2B8.6080702@infracaninophile.co.uk> mdh wrote: > Howdy folks, > I'm having a little trouble understanding a problem that the `host` > command in RELENG_7_0 (very recent) is having. This is by and large > my first time working with IPv6, which I've been meaning to learn for > some time. First off, I've got my zone file configured to return a > AAAA record for x1.mydomain and named isn't complaining. However, > when I run `host -6 x1.mydomain`, host returns the following output: > > (root@rapier) [/etc/namedb]: host -6 x1.mydomain > /usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:1179: internal_send: ::ffff:127.0.0.1#53: Invalid argument > /usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:1179: internal_send: ::ffff:IP.IP.IP.8#53: Invalid argument > /usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:1179: internal_send: ::ffff:127.0.0.1#53: Invalid argument > /usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:1179: internal_send: ::ffff:IP.IP.IP.8#53: Invalid argument > ;; connection timed out; no servers could be reached It's the way resolv.conf works. 
Consider this -- happy-idiot-talk:~:% cat /etc/resolv.conf domain infracaninophile.co.uk nameserver ::1 nameserver 127.0.0.1 Which is all fine and dandy when told to use IPv6: happy-idiot-talk:~:% host -6 h-i-t.infracaninophile.co.uk h-i-t.infracaninophile.co.uk is an alias for happy-idiot-talk.infracaninophile.co.uk. happy-idiot-talk.infracaninophile.co.uk has address 81.187.76.162 happy-idiot-talk.infracaninophile.co.uk has IPv6 address 2001:8b0:151:1:240:5ff:fea5:8db7 happy-idiot-talk.infracaninophile.co.uk mail is handled by 10 smtp.infracaninophile.co.uk. but goes tits-up when told to use IPv4: happy-idiot-talk:~:% host -4 h-i-t.infracaninophile.co.uk host: couldn't get address for '::1': address family not supported nameserver entries in resolv.conf are tried in the order given. Using the -4 or -6 switches to host(1) forces it to try each of the listed nameserver addresses by the stated protocol. It makes no sense at all to try and access an IPv6 address using IPv4 transport, and trying the converse: an IPv4 address via IPv6, will either fail or try and use IPv4-mapped addresses. You might think that the '-4' and '-6' flags to host(1) are pretty much useless in that case, but they work fine when you also tell host(1) the domain name of a nameserver to use[*]: happy-idiot-talk:~:% host -4 h-i-t.infracaninophile.co.uk localhost Using domain server: Name: localhost Address: 127.0.0.1#53 Aliases: h-i-t.infracaninophile.co.uk is an alias for happy-idiot-talk.infracaninophile.co.uk. happy-idiot-talk.infracaninophile.co.uk has address 81.187.76.162 happy-idiot-talk.infracaninophile.co.uk has IPv6 address 2001:8b0:151:1:240:5ff:fea5:8db7 happy-idiot-talk.infracaninophile.co.uk mail is handled by 10 smtp.infracaninophile.co.uk. happy-idiot-talk:~:% host -6 h-i-t.infracaninophile.co.uk localhost Using domain server: Name: localhost Address: ::1#53 Aliases: h-i-t.infracaninophile.co.uk is an alias for happy-idiot-talk.infracaninophile.co.uk. 
happy-idiot-talk.infracaninophile.co.uk has address 81.187.76.162 happy-idiot-talk.infracaninophile.co.uk has IPv6 address 2001:8b0:151:1:240:5ff:fea5:8db7 happy-idiot-talk.infracaninophile.co.uk mail is handled by 10 smtp.infracaninophile.co.uk. Arguably it is a bug for host(1) to give up on the first entry in /etc/resolv.conf when told to use a conflicting address type. However, that should be reported to ISC rather than the FreeBSD project. When allowed to determine the transport type automatically everything works as expected. Cheers, Matthew [*] but only if the domain name of the nameserver you want to query can be resolved by means that don't fall foul of the same IPv4 vs IPv6 problem. Which boils down to using other than the DNS -- eg. /etc/hosts -- to find the nameserver address. -- Dr Matthew J Seaman MA, D.Phil. 7 Priory Courtyard Flat 3 PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate Kent, CT11 9PW -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 258 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081109/c82da5e2/signature.pgp From lists at lozenetz.org Sun Nov 9 10:27:30 2008 From: lists at lozenetz.org (Anton - Valqk) Date: Sun Nov 9 10:27:37 2008 Subject: php-cgi frozen with sbwait when SMP enable In-Reply-To: References: Message-ID: <491726F4.4040808@lozenetz.org> You can try taking look to lighttpd status and fcgi processes status like this: server.modules += ( "mod_status" ) status.status-url = "/server-status" status.statistics-url = "/sstatus1" status.statistics-url gives info for each fastcgi like this: fastcgi.active-requests: 0 fastcgi.backend.backend1.0.connected: 12493970 fastcgi.backend.backend1.0.died: 0 fastcgi.backend.backend1.0.disabled: 0 fastcgi.backend.backend1.0.load: 0 fastcgi.backend.backend1.0.overloaded: 0 fastcgi.backend.backend1.load: 1 fastcgi.requests: 19479062 etc... 
read what each means on lighttpd site... pls tell what caused this, it'd be very interesting to me! cheers, valqk. Ken Chen wrote: > I capture something. > > Please check the PID 57776. It's CPU time never change since my previous > mail here. > > web4# ps alx | grep php-cgi | grep -v grep | grep sbwait > 65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 > /usr/local/bin/php-cgi > 65534 57801 47240 0 4 0 182328 82408 sbwait I ?? 0:19.97 > /usr/local/bin/php-cgi > 65534 57809 47240 0 4 0 182328 84096 sbwait I ?? 1:12.03 > /usr/local/bin/php-cgi > 65534 57823 47240 0 4 0 182328 84492 sbwait I ?? 2:04.21 > /usr/local/bin/php-cgi > 65534 57833 47240 0 4 0 183352 83316 sbwait I ?? 0:28.62 > /usr/local/bin/php-cgi > 65534 57866 47240 0 4 0 182328 79952 sbwait I ?? 0:05.92 > /usr/local/bin/php-cgi > 65534 57870 47240 0 4 0 182328 83184 sbwait I ?? 0:56.83 > /usr/local/bin/php-cgi > 65534 57871 47240 0 4 0 182328 83388 sbwait I ?? 0:54.96 > /usr/local/bin/php-cgi > 65534 57891 47240 0 4 0 182328 84436 sbwait I ?? 1:58.32 > /usr/local/bin/php-cgi > 65534 57925 47240 0 4 0 182328 84380 sbwait I ?? 2:03.53 > /usr/local/bin/php-cgi > 65534 65944 47240 0 4 0 182328 84184 sbwait I ?? 0:39.97 > /usr/local/bin/php-cgi > 65534 65952 47240 0 4 0 182328 84408 sbwait I ?? 0:21.37 > /usr/local/bin/php-cgi > 65534 66007 47240 0 4 0 183352 90960 sbwait I ?? 1:16.81 > /usr/local/bin/php-cgi > 65534 66014 47240 5 4 0 182328 92748 sbwait S ?? 1:41.23 > /usr/local/bin/php-cgi > 65534 66038 47240 1 4 0 182328 91900 sbwait I ?? 1:38.04 > /usr/local/bin/php-cgi > 65534 66060 47240 0 4 0 182328 90048 sbwait I ?? 1:15.46 > /usr/local/bin/php-cgi > 65534 66078 47240 3 4 0 182328 92224 sbwait S ?? 
1:39.66 > /usr/local/bin/php-cgi > web4# top -b > last pid: 70768; load averages: 1.62, 1.65, 1.43 up 4+15:56:06 > 22:53:48 > 85 processes: 1 running, 84 sleeping > > Mem: 492M Active, 1204M Inact, 218M Wired, 60M Cache, 112M Buf, 27M Free > Swap: 2019M Total, 20K Used, 2019M Free > > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > 69544 nobody 1 8 0 203M 38500K nanslp 1 6:31 11.33% php > 47290 nobody 1 4 0 101M 98M kqread 1 30:42 2.98% lighttpd > 66526 nobody 1 4 0 178M 92796K accept 1 1:40 1.12% php-cgi > 66077 nobody 1 4 0 178M 92512K accept 0 1:49 1.07% php-cgi > 65921 nobody 1 4 0 178M 92696K accept 0 1:43 0.98% php-cgi > 65968 nobody 1 4 0 178M 92484K accept 0 1:43 0.93% php-cgi > 66017 nobody 1 4 0 178M 92444K accept 0 1:50 0.88% php-cgi > 65979 nobody 1 4 0 178M 92676K accept 1 1:44 0.88% php-cgi > 66424 nobody 1 4 0 178M 92928K accept 1 1:36 0.88% php-cgi > 65938 nobody 1 4 0 178M 92336K accept 1 1:52 0.73% php-cgi > 65951 nobody 1 4 0 178M 92704K accept 0 1:48 0.73% php-cgi > 66016 nobody 1 4 0 178M 92232K accept 1 1:41 0.73% php-cgi > 65950 nobody 1 4 0 178M 93192K accept 0 1:51 0.68% php-cgi > 65999 nobody 1 4 0 178M 92940K accept 1 1:46 0.63% php-cgi > 66008 nobody 1 4 0 178M 93000K accept 1 1:46 0.63% php-cgi > 69286 nobody 1 4 0 178M 92208K accept 1 0:37 0.63% php-cgi > 47289 nobody 1 4 0 73400K 70640K kqread 1 12:02 0.59% lighttpd > 65980 nobody 1 4 0 178M 93156K accept 1 1:51 0.59% php-cgi > 2008/11/7 Ivan Voras > > >> Ken Chen wrote: >> >>> Oh.. sorry, I forgot to provide the information of my environment. 
>>> >>> web4# php-cgi -v >>> PHP 5.2.6 (cgi-fcgi) (built: Nov 2 2008 11:16:30) >>> Copyright (c) 1997-2008 The PHP Group >>> Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies >>> with XCache v1.2.2, Copyright (c) 2005-2007, by mOo >>> web4# /usr/local/lighttpd/sbin/lighttpd -v >>> lighttpd-1.4.19 - a light and fast webserver >>> Build-Date: Sep 1 2008 16:58:51 >>> web4# uname -a >>> FreeBSD web4.xxxx.com 7.0-RELEASE-p5 FreeBSD 7.0-RELEASE-p5 #11: Mon Nov >>> >> 3 >> >>> 01:10:36 CST 2008 root@web4.xxxx.com:/usr/obj/usr/src/sys/WEB4 i386 >>> web4# ps alx | grep php-cgi | grep -v grep | grep sbwait >>> 65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 >>> /usr/local/bin/php-cgi >>> 65534 57801 47240 0 4 0 182328 82408 sbwait I ?? 0:19.97 >>> /usr/local/bin/php-cgi >>> 65534 57809 47240 0 4 0 182328 84096 sbwait I ?? 1:12.03 >>> /usr/local/bin/php-cgi >>> 65534 57823 47240 0 4 0 182328 84492 sbwait I ?? 2:04.21 >>> /usr/local/bin/php-cgi >>> 65534 57833 47240 0 4 0 183352 83316 sbwait I ?? 0:28.62 >>> /usr/local/bin/php-cgi >>> 65534 57866 47240 0 4 0 182328 79952 sbwait I ?? 0:05.92 >>> /usr/local/bin/php-cgi >>> 65534 57870 47240 0 4 0 182328 83184 sbwait I ?? 0:56.83 >>> /usr/local/bin/php-cgi >>> 65534 57871 47240 0 4 0 182328 83388 sbwait I ?? 0:54.96 >>> /usr/local/bin/php-cgi >>> 65534 57891 47240 0 4 0 182328 84436 sbwait I ?? 1:58.32 >>> /usr/local/bin/php-cgi >>> 65534 57925 47240 0 4 0 182328 84380 sbwait I ?? 2:03.53 >>> /usr/local/bin/php-cgi >>> 65534 65944 47240 0 4 0 182328 84184 sbwait I ?? 0:39.97 >>> /usr/local/bin/php-cgi >>> 65534 65952 47240 0 4 0 182328 84408 sbwait I ?? 0:21.37 >>> /usr/local/bin/php-cgi >>> >> This does seem a bit unusual, but seeing that your execution times are >> not null it might that the PHP servers are actually doing some useful >> work. You should have a mixture of various states in PHP - do they show >> up in top? 
>> >> My own example is: >> >> last pid: 77421; load averages: 2.82, 2.59, 2.13 >> up >> 55+16:58:49 15:48:16 >> 209 processes: 2 running, 206 sleeping, 1 zombie >> CPU: 49.8% user, 0.0% nice, 2.8% system, 0.0% interrupt, 47.4% idle >> Mem: 1493M Active, 1583M Inact, 278M Wired, 139M Cache, 112M Buf, 505M Free >> Swap: 4500M Total, 416M Used, 4084M Free, 9% Inuse >> >> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND >> 75863 www 1 4 0 162M 50020K sbwait 3 2:54 36.77% php-cgi >> 76830 www 1 103 0 156M 41556K CPU2 3 1:28 36.77% php-cgi >> 76834 www 1 4 0 163M 56628K sbwait 0 2:23 33.59% php-cgi >> 76019 www 1 4 0 150M 38948K accept 3 3:12 20.56% php-cgi >> 76825 www 1 4 0 158M 42912K accept 2 1:21 18.16% php-cgi >> 76846 www 1 4 0 162M 42600K sbwait 1 1:07 14.36% php-cgi >> 76835 www 1 4 0 151M 39948K accept 2 1:28 12.60% php-cgi >> 76829 www 1 4 0 150M 36564K sbwait 2 1:46 2.98% php-cgi >> >> This is unusually high load, a spike, for this server but it has many >> cores and it's stable. It's also running 7.1-PRERELEASE. >> >> >> > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > > From vermaden at interia.pl Sun Nov 9 13:09:29 2008 From: vermaden at interia.pl (vermaden) Date: Sun Nov 9 13:09:37 2008 Subject: sysctl debug.cpufreq.highest Message-ID: <20081109210925.ED4CF46CCF0@f46.poczta.interia.pl> Thank You mate. Works like a charm, why not merge it into STABLE and CURRENT branches? I assume that STABLE is currently frozen 'cause of the RELEASE process, but such a small (and tested as 'lowest') change should fit into RELEASE. Regards vermaden > I've rolled a patchset to do this for 7.0-RELEASE > (http://acm.poly.edu/~spawk/cpufreq/) if anyone's interested. 
> > -Boris > > vermaden wrote: > > Hi, > > > > Currently there is possibility to set lowest speed of > > cpu for scaling with cpufreq (debug.cpufreq.lowest), > > it would be good to include also a option to set the > > highest possible freq to use with cpufreq, some laptops > > get too hot and/or consume too much power when running > > on maximum power/speed of cpu. > > > > Regards > > vermaden From joe at zircon.seattle.wa.us Sun Nov 9 15:49:37 2008 From: joe at zircon.seattle.wa.us (Joe Kelsey) Date: Sun Nov 9 16:05:42 2008 Subject: Western Digital hard disks and ATA timeouts In-Reply-To: <77C223A7-C5FC-45DE-BF1A-3BC7982FA582@FreeBSD.ORG> References: <20081107071752.GA5842@icarus.home.lan> <77C223A7-C5FC-45DE-BF1A-3BC7982FA582@FreeBSD.ORG> Message-ID: <49177244.9060802@zircon.seattle.wa.us> Søren Schmidt wrote: > On 7Nov, 2008, at 20:12 , Peter Wemm wrote: > >> On Thu, Nov 6, 2008 at 11:17 PM, Jeremy Chadwick >> wrote: >> [..] >>> As stated, FreeBSD's ATA command timeout is hard-set to 5 seconds, and >>> is not adjustable without editing the ATA code yourself and increasing >>> the value. The FreeNAS folks have made patches available to turn the >>> timeout value into a sysctl. >>> >>> Soren and/or others, please increase this timeout value. Five seconds >>> has now been deemed too aggressive a default. And please consider >>> migrating the timeout value into a sysctl. >> >> The 5 second timeout has been a problem for quite a while actually. >> I've had a number of instances where I've had to increase it to 20 or >> 30 seconds when recovering from marginal drives. The longest >> "successful" recovery attempt I've seen was 26 seconds, I believe on a >> Maxtor drive a few years ago. ("successful" == the drive spent 26 >> seconds but eventually successfully read the sector).
Even the IBM >> death star drives could take much longer than 5 seconds to do a >> recovery 5 years ago. 5 seconds has never been a good default. >> >> I think the timeout should be increased to at least 30 seconds. My >> windows box has a timeout that goes for several minutes. >> >> If there is concern about FreeBSD appearing to hang, I could imagine >> that a console warning message could be printed after 5 seconds. But >> just say "drive has not yet responded". But give it more time. >> >> In this day and age we're generally not playing games with udma33 vs >> 66, notched cables, poor CRC support etc. SATA seems to have >> eliminated all that. Hmm, it might make sense to increase the timeout >> on SATA connections to 2 or 3 minutes by default. > > Actually I do have a patch around that logs the timeout on the console > after the normal timeout (5secs), then just goes on to wait for double > the timeout and log again etc etc, final timeout was IIRC 60 secs but > could be anything. I have a disk which I am finally getting rid of that produces READ_DMA and WRITE_DMA errors at a pretty high rate. I did enable the extra ATA error reporting and it doesn't seem to indicate any sort of actual errors, just extra long timeouts. At one time, I did change the system to extend the timeout, but I did not see any real improvement at 30 seconds. I suspect that an even more extended timeout would be necessary to solve the problem. I am removing the disk this week. Does anyone want a disk that produces DMA timeouts at a regular rate? Would it help actually solve this problem? Please let me know if you want such a beast and I will ship it to you.
/Joe From lehmann at ans-netz.de Sun Nov 9 21:46:41 2008 From: lehmann at ans-netz.de (Oliver Lehmann) Date: Sun Nov 9 21:46:48 2008 Subject: libzfs: vmem.h missing In-Reply-To: <20081109092051.99e8d97c.lehmann@ans-netz.de> References: <20081109092051.99e8d97c.lehmann@ans-netz.de> Message-ID: <20081110064638.d0f20e20.lehmann@ans-netz.de> Oliver Lehmann wrote: > Hi, > > when I try to compile an actual 7 STABLE Im getting: > > why is vmem.h missing? My cvsupfile only contained src-sys... so vmem.h probably got removed, but zfs_context.h was not updated.... -- Oliver Lehmann http://www.pofo.de/ http://wishlist.ans-netz.de/ From phk at phk.freebsd.dk Sun Nov 9 23:16:34 2008 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Sun Nov 9 23:16:42 2008 Subject: fifo log problem In-Reply-To: Your message of "Mon, 03 Nov 2008 20:43:53 EST." <200811040143.mA41hjaa029665@lava.sentex.ca> Message-ID: <73413.1226301391@critter.freebsd.dk> In message <200811040143.mA41hjaa029665@lava.sentex.ca>, Mike Tancsa writes: >I tried changing the config so that there is only the fifo log being >written to and disabled newsyslog so that syslogd is not getting a >HUP signal. The strange thing is that reading from it gives >different results?!? > >Sometimes doing >[ps0278]# fifolog_reader all.fifo | wc >>From 0 Wed Dec 31 19:00:00 1969 >To 1225760679 Mon Nov 3 20:04:39 2008 >Read from 1d800 > 59 413 3068 >0[ps0278]# > >and a exactly for 1min it will show the correct results > >0[ps0278]# fifolog_reader all.fifo | wc >>From 0 Wed Dec 31 19:00:00 1969 >To 1225760538 Mon Nov 3 20:02:18 2008 >Read from 0 > 10765 75995 556816 >0[ps0278]# I could fear that you have two fifologs running at the same time, possibly as a result of syslogd doing something strange on sighup... -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
From ivoras at freebsd.org Mon Nov 10 02:37:46 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Mon Nov 10 02:37:54 2008 Subject: php-cgi frozen with sbwait when SMP enable In-Reply-To: <491726F4.4040808@lozenetz.org> References: <491726F4.4040808@lozenetz.org> Message-ID: Anton - Valqk wrote: > You can try taking look to lighttpd status and fcgi processes status > like this: > server.modules += ( "mod_status" ) > status.status-url = "/server-status" > status.statistics-url = "/sstatus1" > > status.statistics-url gives info for each fastcgi like this: > > fastcgi.active-requests: 0 > fastcgi.backend.backend1.0.connected: 12493970 > fastcgi.backend.backend1.0.died: 0 > fastcgi.backend.backend1.0.disabled: 0 > fastcgi.backend.backend1.0.load: 0 > fastcgi.backend.backend1.0.overloaded: 0 > fastcgi.backend.backend1.load: 1 > fastcgi.requests: 19479062 > > > etc... read what each means on lighttpd site... > pls tell what caused this, it'd be very interesting to me! Yes, though I don't use lighttpd right now, I plan to one day so I'm also interested in the cause and solution for this problem. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081110/7c6021b2/signature.pgp From victor at bsdes.net Mon Nov 10 04:45:56 2008 From: victor at bsdes.net (Victor Balada Diaz) Date: Mon Nov 10 04:46:05 2008 Subject: Interrupt routing issues in FreeBSD 7.1-BETA2 Message-ID: <20081110124553.GL2327@alf.bsdes.net> Hello, last month i reported a problem with interrupt storms in re(4). You can find that report here: http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/046075.html After that, i filed a bug report: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/128287 Remko Lodder (CC'ed) suggested that i should disable USB code in kernel if i was not using it.
As i wasn't using it, i tried disabling it and so far, the problem was solved. After a few days in production with the new kernel, GENERIC without USB and firewire i've found other interrupt problems: vmstat -i:

oro# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           2          0
irq9: acpi0                            1          0
irq19: re0                      39786514         42
irq22: atapci0                8050818241       8515
cpu0: timer                   1890881479       1999
cpu1: timer                   1890881390       1999
Total                        11872367627      12557

This time the problem is not in IRQ19 (re), but on IRQ 22 atapci0. This looked weird to me, as this interrupt is not shared, so i searched what was the model and MFG of this motherboard and found with dmidecode this: Base Board Information Manufacturer: MICRO-STAR INTERANTIONAL CO.,LTD Product Name: MS-7368 Version: 1.0 Looking at freebsd archives for this model i've found other people had the same problem: http://lists.freebsd.org/pipermail/freebsd-current/2007-November/080525.html http://lists.freebsd.org/pipermail/freebsd-questions/2008-June/176794.html Related problems: http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038571.html http://lists.freebsd.org/pipermail/freebsd-current/2008-February/083584.html Does anyone know what could be the cause of this or how can i fix it? If there is any more information i could provide to help solving this, just ask for it and i'll do my best to get it. Thanks in advance. Regards. -- La prueba más fehaciente de que existe vida inteligente en otros planetas, es que no han intentado contactar con nosotros. From fernan.aguero at gmail.com Mon Nov 10 05:22:00 2008 From: fernan.aguero at gmail.com (Fernan Aguero) Date: Mon Nov 10 06:47:57 2008 Subject: [fernan@iib.unsam.edu.ar: Re: [FreeBSD] Fix for ServerWorks HT1000 in upcoming 7.1?]
In-Reply-To: <20081110120134.GB14740@iib.unsam.edu.ar> References: <20081110120134.GB14740@iib.unsam.edu.ar> Message-ID: <520894aa0811100452p5f7b5ebdx9f3c9929a452ef10@mail.gmail.com> On Mon, Nov 10, 2008 at 10:01 AM, Fernan Aguero wrote: > > Date: Wed, 29 Oct 2008 14:05:22 -0300 > From: Fernan Aguero > To: Kirk Strauser > Cc: freebsd-stable@freebsd.org, d@delphij.net, > John Baldwin , re@freebsd.org > Subject: Re: [FreeBSD] Fix for ServerWorks HT1000 in upcoming 7.1? > >> On Tuesday 07 October 2008 17:10:49 Xin LI wrote: >> > Did anyone who can trigger the data corruption has tried John's patch >> > and let us know if it worked? >> > >> > Cheers, >> >> I can confirm that it works on my PowerEdge SC1435. With both controllers >> running in SATA150 mode, I have an uptime of 101 days with moderately heavy >> load. >> -- >> Kirk Strauser > > Same here. The ata_ht1000.patch referenced in this thread > works, in my case at least with 1 controller running in > SATA150 mode (I have only 1 disk). > > However, the recent 7.1-BETA2 does not work for me, contrary > to reports saying that this BETA contains the patch. Looking > at the sources, it seems evident that the patch applied is > not identical to the ata_ht1000.patch in this thread. > > This patch: > http://people.freebsd.org/~jhb/patches/ata_ht1000.patch > is the only requirement to turn a broken 7.1-BETA1 into a > working 7.1 for a Dell PowerEdge SC1435. > > The recent 7.1-BETA2, does not work! > > Fernan > Hi, can anyone confirm if 7.1-BETA2 works for this platform? It does not for me. Will there be another BETA/RC before 7.1 is rolled out? Thanks, -- fernan From sclark46 at earthlink.net Mon Nov 10 08:01:04 2008 From: sclark46 at earthlink.net (Stephen Clark) Date: Mon Nov 10 08:01:11 2008 Subject: du and df don't agree Message-ID: <49185ABC.6080004@earthlink.net> Why would du show 630k used by /tmp while df show 161M used by /tmp? 
# du -sh /tmp 630K /tmp # df -h Filesystem Size Used Avail Capacity Mounted on /dev/ad0s1a 94M 41M 45M 47% / devfs 1.0K 1.0K 0B 100% /dev /dev/ad0s1e 193M 161M 17M 91% /tmp ... remaining fs info removed # uname -a FreeBSD 6.1-STABLE FreeBSD 6.1-STABLE I have run fstat /tmp and can't find any files that are using the space that df is claiming as being used. -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) From erwan at rail.eu.org Mon Nov 10 08:22:06 2008 From: erwan at rail.eu.org (Erwan David) Date: Mon Nov 10 08:22:13 2008 Subject: du and df don't agree In-Reply-To: <49185ABC.6080004@earthlink.net> References: <49185ABC.6080004@earthlink.net> Message-ID: <20081110160315.GH9074@rail.eu.org> Le Mon 10/11/2008, Stephen Clark disait > Why would du show 630k used by /tmp while df show 161M used > by /tmp? > > > # du -sh /tmp > 630K /tmp > # df -h > Filesystem Size Used Avail Capacity Mounted on > /dev/ad0s1a 94M 41M 45M 47% / > devfs 1.0K 1.0K 0B 100% /dev > /dev/ad0s1e 193M 161M 17M 91% /tmp > ... remaining fs info removed > > # uname -a > FreeBSD 6.1-STABLE FreeBSD 6.1-STABLE > > I have run fstat /tmp and can't find any files that are using > the space that df is claiming as being used. Because this space is used by removed, but open files. Try lsof +L1 to see them. -- Erwan From jille at quis.cx Mon Nov 10 08:26:07 2008 From: jille at quis.cx (Jille Timmermans) Date: Mon Nov 10 08:26:14 2008 Subject: du and df don't agree In-Reply-To: <49185ABC.6080004@earthlink.net> References: <49185ABC.6080004@earthlink.net> Message-ID: <49185CDB.70702@quis.cx> Probably some file descriptor to an unlinked file is still open. The space on disk will be freed when the last descriptor is closed. and because there is no file linked to the data, du can't see it. 
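[Editor's illustration, not part of the thread.] The effect Jille describes can be reproduced with a short Python sketch. It assumes a POSIX system, where a file that is still open may be unlinked; the function name unlinked_file_demo is made up for this example. The file's directory entry disappears, so a directory walk such as du cannot count it, but the inode and its blocks stay allocated until the descriptor is closed, which is what df keeps reporting.

```python
import os
import tempfile


def unlinked_file_demo(size=1024 * 1024):
    """Create a file, unlink it while it is still open, and report what remains."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * size)
        os.unlink(path)        # the name is gone, so a directory walk cannot find it
        st = os.fstat(fd)      # the inode is still reachable through the descriptor
        return {
            "visible": os.path.exists(path),  # False: no directory entry left
            "links": st.st_nlink,             # 0: no names refer to the inode
            "size": st.st_size,               # data is still held by the filesystem
        }
    finally:
        os.close(fd)           # only after this can the blocks be released


if __name__ == "__main__":
    print(unlinked_file_demo())
```

Tools that list open files per process (fstat or lsof, as suggested in this thread) are the way to find which process is holding such a file.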
(iirc MySQL does this in some situations when using temp-tables or rebuilding tables) -- Jille Stephen Clark wrote: > Why would du show 630k used by /tmp while df show 161M used > by /tmp? > > > # du -sh /tmp > 630K /tmp > # df -h > Filesystem Size Used Avail Capacity Mounted on > /dev/ad0s1a 94M 41M 45M 47% / > devfs 1.0K 1.0K 0B 100% /dev > /dev/ad0s1e 193M 161M 17M 91% /tmp > ... remaining fs info removed > > # uname -a > FreeBSD 6.1-STABLE FreeBSD 6.1-STABLE > > I have run fstat /tmp and can't find any files that are using > the space that df is claiming as being used. > From sclark46 at earthlink.net Mon Nov 10 08:27:25 2008 From: sclark46 at earthlink.net (Stephen Clark) Date: Mon Nov 10 08:27:33 2008 Subject: du and df don't agree In-Reply-To: <49185CDB.70702@quis.cx> References: <49185ABC.6080004@earthlink.net> <49185CDB.70702@quis.cx> Message-ID: <491860E9.2020400@earthlink.net> Jille Timmermans wrote: > Probably some file descriptor to an unlinked file is still open. > The space on disk will be freed when the last descriptor is closed. > and because there is no file linked to the data, du can't see it. > > (iirc MySQL does this in some situations when using temp-tables or > rebuilding tables) > > -- Jille > > Stephen Clark wrote: >> Why would du show 630k used by /tmp while df show 161M used >> by /tmp? >> >> >> # du -sh /tmp >> 630K /tmp >> # df -h >> Filesystem Size Used Avail Capacity Mounted on >> /dev/ad0s1a 94M 41M 45M 47% / >> devfs 1.0K 1.0K 0B 100% /dev >> /dev/ad0s1e 193M 161M 17M 91% /tmp >> ... remaining fs info removed >> >> # uname -a >> FreeBSD 6.1-STABLE FreeBSD 6.1-STABLE >> >> I have run fstat /tmp and can't find any files that are using >> the space that df is claiming as being used. >> > Thanks, I managed to find the offending process with fstat and kill it.
Now: Filesystem Size Used Avail Capacity Mounted on /dev/ad0s1e 193M 666K 177M 0% /tmp $ du -sh /tmp 682K /tmp Regards, Steve -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) From eugen at kuzbass.ru Mon Nov 10 08:51:57 2008 From: eugen at kuzbass.ru (Eugene Grosbein) Date: Mon Nov 10 08:52:04 2008 Subject: du and df don't agree In-Reply-To: <49185ABC.6080004@earthlink.net> References: <49185ABC.6080004@earthlink.net> Message-ID: <20081110162111.GA26951@svzserv.kemerovo.su> On Mon, Nov 10, 2008 at 11:01:00AM -0500, Stephen Clark wrote: > Why would du show 630k used by /tmp while df show 161M used > by /tmp? > > I have run fstat /tmp and can't find any files that are using > the space that df is claiming as being used. You need lsof +aL1 /tmp to see an answer. Eugene Grosbein From lists at reiteration.net Mon Nov 10 11:21:45 2008 From: lists at reiteration.net (John) Date: Mon Nov 10 11:21:52 2008 Subject: OpenSSH error: error: key_read: uudecode failed FreeBSD_7 stable Message-ID: <4918757E.7030904@reiteration.net> Hello, I installed FreeBSD from 7.0-release and brought up to stable a few days ago: lentil# uname -a FreeBSD lentil.growveg.org 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Mon Nov 10 09:21:06 GMT 2008 root@lentil.growveg.org:/ext0/system/src/sys/i386/compile/LENTIL i386 today I recompiled the kernel because I had stripped out the ipv6 stuff - however some apps seem to require it. That all went well. Afterwards, I set up the logging. 
I'm noticing the following error from the console: Nov 10 17:19:43 lentil sshd[1956]: error: key_read: uudecode AAAAB3NzaC1kc3MAAACBAKLAJVTvYOqi5bVYEyahzSTrb0L4JbLGhtiNGEXQr/pqWcTxBoicc1/EJt0MCirV3+A63smW4e7sfeZQhcKkwvL6MrSqc2wKSmJfPkV5GY/zwFiRLcsuNRWtd9Zgg8/mWhp5fZlZ6M81Cz\n failed Nov 10 17:19:43 lentil kernel: Nov 10 17:19:43 lentil sshd[1956]: error: key_read: uudecode AAAAB3NzaC1kc3MAAACBAKLAJVTvYOqi5bVYEyahzSTrb0L4JbLGhtiNGEXQr/pqWcTxBoicc1/EJt0MCirV3+A63smW4e7sfeZQhcKkwvL6MrSqc2wKSmJfPkV5GY/zwFiRLcsuNRWtd9Zgg8/mWhp5fZlZ6M81Cz\n failed Nov 10 17:19:43 lentil sshd[1956]: error: key_read: uudecode AAAAB3NzaC1kc3MAAACBAKLAJVTvYOqi5bVYEyahzSTrb0L4JbLGhtiNGEXQr/pqWcTxBoicc1/EJt0MCirV3+A63smW4e7sfeZQhcKkwvL6MrSqc2wKSmJfPkV5GY/zwFiRLcsuNRWtd9Zgg8/mWhp5fZlZ6M81Cz\n failed Nov 10 17:19:43 lentil sshd[1956]: Accepted publickey for john from 62.49.247.174 port 55327 ssh2 Nov 10 17:19:43 lentil kernel: Nov 10 17:19:43 lentil sshd[1956]: error: key_read: uudecode AAAAB3NzaC1kc3MAAACBAKLAJVTvYOqi5bVYEyahzSTrb0L4JbLGhtiNGEXQr/pqWcTxBoicc1/EJt0MCirV3+A63smW4e7sfeZQhcKkwvL6MrSqc2wKSmJfPkV5GY/zwFiRLcsuNRWtd9Zgg8/mWhp5fZlZ6M81Cz\n failed It doesn't seem to stop sshd from working, but it's annoying nonetheless. I did a quick google, and found someone describing the same problem at https://bugzilla.mindrot.org/show_bug.cgi?id=1525 but they say this was fixed in OpenSSH 4.7. - that URL is the latest report of the problem. I'm running a much later version: sshd: OpenSSH_5.1p1 FreeBSD-20080901, OpenSSL 0.9.8e 23 Feb 2007 Can anyone help me with a fix/workaround? thanks -- John From delphij at delphij.net Mon Nov 10 14:47:51 2008 From: delphij at delphij.net (Xin LI) Date: Mon Nov 10 14:47:58 2008 Subject: HEADSUP bce(4) 7-STABLE users [Fwd: svn commit: r184826 - in stable/7/sys: . dev/bce] Message-ID: <4918BA09.5020000@delphij.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, FYI. I have just committed a MFC of most of recent bce(4) improvements. 
These changes were in tree for some months, while on the other hand it's a big chunk of changes. Feedbacks appreciated. - -------- Original Message -------- Subject: svn commit: r184826 - in stable/7/sys: . dev/bce Date: Mon, 10 Nov 2008 22:40:16 +0000 (UTC) From: Xin LI To: src-committers@FreeBSD.ORG, svn-src-all@FreeBSD.ORG, svn-src-stable@FreeBSD.ORG, svn-src-stable-7@FreeBSD.ORG Author: delphij Date: Mon Nov 10 22:40:16 2008 New Revision: 184826 URL: http://svn.freebsd.org/changeset/base/184826 Log: Merge the following bce(4) changes: r176448,178132,178853,179436,179695,179771,182293 r176448 (davidch) - Added loose RX MTU functionality to allow frames larger than 1500 bytes to be accepted even though the interface MTU is set to 1500. - Implemented new TCP header splitting/jumbo frame support which uses two chains for receive traffic rather than the original single receive chain. - Added additional debug support code. r178132 (davidch) - Fixed a problem with the send chain consumer index which would cause TX traffic to sit in the send chain until a received packet kick started the interrupt handler. This would cause extremely slow performance when used with NFS over UDP. - Removed untested polling code. - Updated copyright year in the file header. - Removed inadvertent ^M's created by DOS text editor. r178853 (scottl) The BCE chips appear to have an undocumented requirement that RX frames be aligned on an 8 byte boundary. Prior to rev 1.36 (now r176448) this wasn't a problem because mbuf clusters tend be naturally aligned. The switch to using split buffers with the first buffer being the embedded data area of the mbuf has broken this assumption, at least on i386, causing a complete failure of RX functionality. Fix this for now by using a full cluster for the first RX buffer. 
A more sophisticated approach could be done with the old buffer scheme to realign the m_data pointer with m_adj(), but I'm also not clear on performance benefits of this old scheme or the performance implications of adding an m_adj() call to every allocation. r179436 (jhb) Trim an extra semi-colon. r179695 (davidch) - Fixed kern/123696 by increasing firmware timeout value from 100 to 1000. - Fixed a problem on i386 architecture when using split header/jumbo frame firmware caused by hardware alignment requirements. - Added #define BCE_USE_SPLIT_HEADER to allow the feature to be enabled/disabled. Enabled by default. PR: kern/123696 r179771 (davidch) - Added support for BCM5709 and BCM5716 controllers. r182293 (davidch) - Updated support for 5716. - Added some additional code for debug builds. - Fixed a problem printing physical memory on 64bit system during debugging. - Modified some of the context memory and mailbox register names to more clearly distinguish their use. - Added memory barriers for Intel CPUs when accessing host memory data structures which are written by hardware. Approved by: re (kib) Modified: stable/7/sys/ (props changed) stable/7/sys/dev/bce/if_bce.c stable/7/sys/dev/bce/if_bcefw.h stable/7/sys/dev/bce/if_bcereg.h Cheers, - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (FreeBSD) iEYEARECAAYFAkkYuggACgkQi+vbBBjt66AxbQCgvJlteJFhEsQSg5JEyxDhWmAx vxMAn13qErkFGa/hVGi8oyl/xiMSOW9Y =q8Qv -----END PGP SIGNATURE----- From gaijin.k at gmail.com Mon Nov 10 18:03:20 2008 From: gaijin.k at gmail.com (Alexandre "Sunny" Kovalenko) Date: Mon Nov 10 18:03:27 2008 Subject: sysctl debug.cpufreq.highest In-Reply-To: <20081108012734.3274F1E3055@f03.poczta.interia.pl> References: <20081108012734.3274F1E3055@f03.poczta.interia.pl> Message-ID: <1226368988.1244.3.camel@RabbitsDen> On Sat, 2008-11-08 at 02:27 +0100, vermaden wrote: > Hi, > > Currently there is possibility to set lowest speed of > cpu for scaling with cpufreq (debug.cpufreq.lowest), > it would be good to include also a option to set the > highest possible freq to use with cpufreq, some laptops > get too hot and/or consume too much power when running > on maximum power/speed of cpu. If temperature is the concern, you could override passive cooling threshold by putting something like hw.acpi.thermal.user_override=1 hw.acpi.thermal.tz1._PSV=75C into your /etc/sysctl.conf You will need to figure out which thermal zone you need to override _PSV for (in my case tz1) and what do you want to cap temperature at (in my case 75C) HTH, -- Alexandre "Sunny" Kovalenko (????????? ?????????) From mail25 at bzerk.org Tue Nov 11 00:03:28 2008 From: mail25 at bzerk.org (Ruben de Groot) Date: Tue Nov 11 00:03:36 2008 Subject: du and df don't agree In-Reply-To: <20081110162111.GA26951@svzserv.kemerovo.su> References: <49185ABC.6080004@earthlink.net> <20081110162111.GA26951@svzserv.kemerovo.su> Message-ID: <20081111080321.GA94210@ei.bzerk.org> On Mon, Nov 10, 2008 at 11:21:11PM +0700, Eugene Grosbein typed: > On Mon, Nov 10, 2008 at 11:01:00AM -0500, Stephen Clark wrote: > > > Why would du show 630k used by /tmp while df show 161M used > > by /tmp? 
> > > > I have run fstat /tmp and can't find any files that are using > > the space that df is claiming as being used. > > You need lsof +aL1 /tmp to see an answer. Please don't advise people to install third party apps (lsof) where base system tools (fstat) can do the job. Ruben From johan at stromnet.se Tue Nov 11 01:04:57 2008 From: johan at stromnet.se (=?ISO-8859-1?Q?Johan_Str=F6m?=) Date: Tue Nov 11 01:05:05 2008 Subject: panic in kevent Message-ID: <8255ED46-5A67-4A95-AFE7-34F6E8E2388E@stromnet.se> Hi One of my DL360G5 boxes running 7.0 had a panic this night: jb-2 ~$ uname -rsv FreeBSD 7.0-RELEASE-p4 FreeBSD 7.0-RELEASE-p4 #2: Thu Sep 4 10:49:27 CEST 2008 johan@jb-2:/usr/obj/usr/src/sys/DL360G5 The config is a GENERIC with some pf, IPSEC and ALTQ stuff enabled. jb-2 /usr/obj/usr/src/sys/DL360G5# kgdb kernel.debug /var/crash/vmcore.0 [GDB will not be able to debug user-mode threads: /usr/lib/ libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd". Unread portion of the kernel message buffer: panic: page fault cpuid = 1 Uptime: 40d22h42m5s Physical memory: 10225 MB Dumping 867 MB: 852 836 820 804 788 772 756 740 724 708 692 676 660 644 628 612 596 580 564 548 532 516 500 484 468 452 436 420 404 388 372 356 #0 doadump () at pcpu.h:194 194 __asm __volatile("movq %%gs:0,%0" : "=r" (td)); (kgdb) where #0 doadump () at pcpu.h:194 #1 0x0000000000000004 in ?? () #2 0xffffffff804bb259 in boot (howto=260) at /usr/src/sys/kern/ kern_shutdown.c:409 #3 0xffffffff804bb65d in panic (fmt=0x104
<Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
#4  0xffffffff8079ec84 in trap_fatal (frame=0xffffff01b33229f0, eva=18446742984664492240)
    at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff8079f055 in trap_pfault (frame=0xffffffffb6337780, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:641
#6  0xffffffff8079f998 in trap (frame=0xffffffffb6337780)
    at /usr/src/sys/amd64/amd64/trap.c:410
#7  0xffffffff8078560e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#8  0xffffffff80494b0b in knlist_remove_kq (knl=0xffffff0114407748, kn=0xffffff0054f5fc30,
    knlislocked=0, kqislocked=0) at /usr/src/sys/kern/kern_event.c:1615
#9  0xffffffff80495f58 in kqueue_register (kq=Variable "kq" is not available.
) at /usr/src/sys/kern/kern_event.c:956
#10 0xffffffff804962f3 in kern_kevent (td=0xffffff01b33229f0, fd=Variable "fd" is not available.
) at /usr/src/sys/kern/kern_event.c:673
#11 0xffffffff80496ca5 in kevent (td=0xffffff01b33229f0, uap=0xffffffffb6337be0)
    at /usr/src/sys/kern/kern_event.c:594
#12 0xffffffff8079f2d7 in syscall (frame=0xffffffffb6337c70)
    at /usr/src/sys/amd64/amd64/trap.c:852
#13 0xffffffff8078581b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#14 0x0000000010999ccc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Please let me know if I can help with anything else. Is there any way
to know which app caused this?
I did some googling with only one or two similar crashes as result,
although the hits didn't give much.
I've never had this crash before.
Thanks
--
Johan

From wjw at digiware.nl Tue Nov 11 01:22:48 2008
From: wjw at digiware.nl (Willem Jan Withagen)
Date: Tue Nov 11 01:23:01 2008
Subject: du and df don't agree
In-Reply-To: <20081111080321.GA94210@ei.bzerk.org>
References: <49185ABC.6080004@earthlink.net> <20081110162111.GA26951@svzserv.kemerovo.su> <20081111080321.GA94210@ei.bzerk.org>
Message-ID: <49194EE2.7060708@IMAP>

Ruben de Groot wrote:
> On Mon, Nov 10, 2008 at 11:21:11PM +0700, Eugene Grosbein typed:
>> On Mon, Nov 10, 2008 at 11:01:00AM -0500, Stephen Clark wrote:
>>
>>> Why would du show 630k used by /tmp while df show 161M used
>>> by /tmp?
>>>
>>> I have run fstat /tmp and can't find any files that are using
>>> the space that df is claiming as being used.
>> You need lsof +aL1 /tmp to see an answer.
>
> Please don't advise people to install third party apps (lsof) where
> base system tools (fstat) can do the job.

Why not?

This is one of the ways I pick up on all kinds of nice tools I've not heard
of before. And it is not like this is swamping the list with off-topic
questions.

You might have redirected the question to questions@

OTOH:
The initial question indicated that fstat did not give the info wanted.
Your info does not help, because you told him to use fstat, but forgot to
mention HOW. So he is in no way any wiser after your answer.
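
For reference, the effect discussed in this thread (an unlinked file that is
still held open by a process) can be reproduced with a minimal Python sketch.
This is only an illustration, not something taken from the posts: the helper
names tree_bytes and demo and the file name scratch.dat are made up, and the
sketch sums apparent file sizes rather than allocated blocks, so it only
approximates what du reports.

```python
import os
import tempfile


def tree_bytes(root):
    """Sum the sizes of all files reachable by name under root (roughly du's view)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


def demo(size=1 << 20):
    """Unlink a file while it is still open and report what each view sees."""
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "scratch.dat")  # hypothetical file name
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
        try:
            os.write(fd, b"\0" * size)
            visible_before = tree_bytes(root)
            os.unlink(path)                # the name is gone, the data is not
            visible_after = tree_bytes(root)
            st = os.fstat(fd)              # still reachable through the descriptor
            return visible_before, visible_after, st.st_nlink, st.st_size
        finally:
            os.close(fd)                   # only now can the space be released


if __name__ == "__main__":
    print(demo())
```

demo() returns the bytes visible by name before and after the unlink, followed
by the link count and size as seen through the open descriptor. A walk of the
directory tree no longer finds the file, while the filesystem keeps the space
allocated until the descriptor is closed, which is the gap between du and df.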
--WjW

From mail25 at bzerk.org Tue Nov 11 02:55:37 2008
From: mail25 at bzerk.org (Ruben de Groot)
Date: Tue Nov 11 02:55:44 2008
Subject: du and df don't agree
In-Reply-To: <49194EE2.7060708@IMAP>
References: <49185ABC.6080004@earthlink.net> <20081110162111.GA26951@svzserv.kemerovo.su> <20081111080321.GA94210@ei.bzerk.org> <49194EE2.7060708@IMAP>
Message-ID: <20081111105530.GA94707@ei.bzerk.org>

On Tue, Nov 11, 2008 at 10:22:42AM +0100, Willem Jan Withagen typed:
> Ruben de Groot wrote:
> >On Mon, Nov 10, 2008 at 11:21:11PM +0700, Eugene Grosbein typed:
> >>On Mon, Nov 10, 2008 at 11:01:00AM -0500, Stephen Clark wrote:
> >>
> >>>Why would du show 630k used by /tmp while df show 161M used
> >>>by /tmp?
> >>>
> >>>I have run fstat /tmp and can't find any files that are using
> >>>the space that df is claiming as being used.
> >>You need lsof +aL1 /tmp to see an answer.
> >
> >Please don't advise people to install third party apps (lsof) where
> >base system tools (fstat) can do the job.
>
> Why not?

Because it gives the impression the base system is incomplete, which it is not,
at least not in this situation. The wording "you need lsof" is plain wrong.

> This is one of the ways I pick up on all kinds of nice tools I've not heard
> of before. And it is not like this is swamping the list with off-topic
> questions.

Difference of opinion; I prefer to use FreeBSD tools before falling back to
3rd party tools, no matter how nice.

> You might have redirected the question to questions@

I don't see the point. Why didn't you redirect it?

> OTOH
> The initial question indicated that fstat did not give the info wanted.
> Your info does not help, because you told him to use fstat, but forgot to
> mention HOW. So he is in no way any wiser after your answer.

Yes, my bad. But he already found the answer himself (using fstat -f I guess).
Ruben

From wjw at digiware.nl Tue Nov 11 03:27:52 2008
From: wjw at digiware.nl (Willem Jan Withagen)
Date: Tue Nov 11 03:27:59 2008
Subject: Discussing non BSD tools on Stable (Was: Re: du and df don't agree)
In-Reply-To: <20081111105530.GA94707@ei.bzerk.org>
References: <49185ABC.6080004@earthlink.net> <20081110162111.GA26951@svzserv.kemerovo.su> <20081111080321.GA94210@ei.bzerk.org> <49194EE2.7060708@IMAP> <20081111105530.GA94707@ei.bzerk.org>
Message-ID: <49196C32.7000404@IMAP>

Ruben de Groot wrote:
>>>> You need lsof +aL1 /tmp to see an answer.
>>> Please don't advise people to install third party apps (lsof) where
>>> base system tools (fstat) can do the job.
>> Why not?
>
> Because it gives the impression the base system is incomplete, which it is not,
> at least not in this situation. The wording "you need lsof" is plain wrong.

Perhaps the wording is chosen to be too strong. But

>> This is one of the ways I pick up on all kinds of nice tools I've not heard
>> of before. And it is not like this is swamping the list with off-topic
>> questions.
>
> Difference of opinion; I prefer to use FreeBSD tools before falling back to
> 3rd party tools, no matter how nice.

Sure, so do I.
But then base and ports are just like one very large candy store, and there are
too many tools to know about.

>> You might have redirected the question to questions@
>
> I don't see the point. Why didn't you redirect it?

1) You made it sound like it is an obvious question with an even more
   obvious answer. And normally these go to questions@
2) The reason for this is on the Unix FAQ, not sure it has this answer with it.
3) Because I would not know the answer to the question asked.

And what I should have done is change the subject with the previous posting.

>> OTOH
>> The initial question indicated that fstat did not give the info wanted.
>> Your info does not help, because you told him to use fstat, but forgot to
>> mention HOW. So he is in no way any wiser after your answer.
>
> Yes, my bad. But he already found the answer himself (using fstat -f I guess).

You've at least now given me a reason to look at another base tool :)

FAIC this is the last we post on this, and get back to using/hacking FreeBSD.

--WjW

From kostikbel at gmail.com Tue Nov 11 03:28:43 2008
From: kostikbel at gmail.com (Kostik Belousov)
Date: Tue Nov 11 03:28:50 2008
Subject: panic in kevent
In-Reply-To: <8255ED46-5A67-4A95-AFE7-34F6E8E2388E@stromnet.se>
References: <8255ED46-5A67-4A95-AFE7-34F6E8E2388E@stromnet.se>
Message-ID: <20081111112835.GA47073@deviant.kiev.zoral.com.ua>

On Tue, Nov 11, 2008 at 09:49:26AM +0100, Johan Ström wrote:
> Hi
> One of my DL360G5 boxes running 7.0 had a panic this night:
>
> jb-2 ~$ uname -rsv
> FreeBSD 7.0-RELEASE-p4 FreeBSD 7.0-RELEASE-p4 #2: Thu Sep 4 10:49:27
> CEST 2008 johan@jb-2:/usr/obj/usr/src/sys/DL360G5
>
> The config is a GENERIC with some pf, IPSEC and ALTQ stuff enabled.
>
> jb-2 /usr/obj/usr/src/sys/DL360G5# kgdb kernel.debug /var/crash/vmcore.0
> [GDB will not be able to debug user-mode threads: /usr/lib/
> libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "amd64-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
>
> panic: page fault
> cpuid = 1
> Uptime: 40d22h42m5s
> Physical memory: 10225 MB
> Dumping 867 MB: 852 836 820 804 788 772 756 740 724 708 692 676 660
> 644 628 612 596 580 564 548 532 516 500 484 468 452 436 420 404 388
> 372 356
>
> #0  doadump () at pcpu.h:194
> 194     __asm __volatile("movq %%gs:0,%0" : "=r" (td));
> (kgdb) where
> #0  doadump () at pcpu.h:194
> #1  0x0000000000000004 in ??
()
> #2  0xffffffff804bb259 in boot (howto=260) at /usr/src/sys/kern/
> kern_shutdown.c:409
> #3  0xffffffff804bb65d in panic (fmt=0x104 <Address 0x104 out of
> bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
> #4  0xffffffff8079ec84 in trap_fatal (frame=0xffffff01b33229f0,
> eva=18446742984664492240) at /usr/src/sys/amd64/amd64/trap.c:724
> #5  0xffffffff8079f055 in trap_pfault (frame=0xffffffffb6337780,
> usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
> #6  0xffffffff8079f998 in trap (frame=0xffffffffb6337780) at /usr/src/
> sys/amd64/amd64/trap.c:410
> #7  0xffffffff8078560e in calltrap () at /usr/src/sys/amd64/amd64/
> exception.S:169
> #8  0xffffffff80494b0b in knlist_remove_kq (knl=0xffffff0114407748,
> kn=0xffffff0054f5fc30, knlislocked=0, kqislocked=0)
>     at /usr/src/sys/kern/kern_event.c:1615
> #9  0xffffffff80495f58 in kqueue_register (kq=Variable "kq" is not
> available.
> ) at /usr/src/sys/kern/kern_event.c:956
> #10 0xffffffff804962f3 in kern_kevent (td=0xffffff01b33229f0,
> fd=Variable "fd" is not available.
> ) at /usr/src/sys/kern/kern_event.c:673
> #11 0xffffffff80496ca5 in kevent (td=0xffffff01b33229f0,
> uap=0xffffffffb6337be0) at /usr/src/sys/kern/kern_event.c:594
> #12 0xffffffff8079f2d7 in syscall (frame=0xffffffffb6337c70) at /usr/
> src/sys/amd64/amd64/trap.c:852
> #13 0xffffffff8078581b in Xfast_syscall () at /usr/src/sys/amd64/amd64/
> exception.S:290
> #14 0x0000000010999ccc in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
>
>
> Please let me know if I can help with anything else. Is there any way
> to know which app caused this?
> I did some googling with only one or two similar crashes as result,
> although the hits didn't give much.
> I've never had this crash before.

There is a very high chance that the problem is fixed in 7.1. Unless
it is easily reproducible in your setup, there is no easy way to confirm
this.

You can run "info threads" in kgdb to get an overview of the processes on
the crashed system. The thread that was on the CPU during the crash will
be marked with a star.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081111/9303b3e9/attachment.pgp From mh at kernel32.de Tue Nov 11 03:56:33 2008 From: mh at kernel32.de (Marian Hettwer) Date: Tue Nov 11 03:56:41 2008 Subject: du and df don't agree In-Reply-To: <20081111105530.GA94707@ei.bzerk.org> References: <20081111105530.GA94707@ei.bzerk.org> Message-ID: <91810ad3ecdde97481e5806afd5a5dc7@localhost> On Tue, 11 Nov 2008 11:55:30 +0100, Ruben de Groot wrote: > On Tue, Nov 11, 2008 at 10:22:42AM +0100, Willem Jan Withagen typed: >> Ruben de Groot wrote: >> >On Mon, Nov 10, 2008 at 11:21:11PM +0700, Eugene Grosbein typed: >> >>On Mon, Nov 10, 2008 at 11:01:00AM -0500, Stephen Clark wrote: >> >> >> >>>Why would du show 630k used by /tmp while df show 161M used >> >>>by /tmp? >> >>> >> >>>I have run fstat /tmp and can't find any files that are using >> >>>the space that df is claiming as being used. >> >>You need lsof +aL1 /tmp to see an answer. >> > >> >Please don't advise people to install third party apps (lsof) where >> >base system tools (fstat) can do the job. >> >> Why not? > > Because it gives the impression the base system is incomplete, which it is > not, > at least not in this situation. The wording "you need lsof" is plain > wrong. > What about proposing both? As in use fstat from BASE or use lsof from ports. IMO it's good to know that there are several tools which solves your problem. As an example from real world. I love using "sockstat -4" on FreeBSD, but I'm annoyed that it doesn't exist in OpenBSD and it doesn't exist in Debian either. So I'm used to use "netstat -tulpen" on Debian, but that won't work on FreeBSD. Anyway, since I know both ways, I find my way in both systems. Summary: Good to know alternatives. 
./Marian From avg at icyb.net.ua Tue Nov 11 05:00:05 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Tue Nov 11 05:00:14 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: <4912E462.4090608@icyb.net.ua> References: <4912E462.4090608@icyb.net.ua> Message-ID: <491981D0.7060100@icyb.net.ua> on 06/11/2008 14:34 Andriy Gapon said the following: > I have a quite strange problem. > This is with 7-BETA amd64. > All of USB is out of kernel and is loaded via modules. > BIOS has "Legacy USB" enabled. > I have only a USB keyboard, no PS/2 port. > > The keyboard works file in BIOS and for selecting boot device in boot0 > menu. It also works in loader menu. If in the menu I select to go to > loader prompt then it works for about 5 seconds and then "dies" - no > reaction to key presses, no led change, nothing. > I haven't actually verified if the keyboard would still work if I stayed > in loader menu for longer than ~10 seconds. > > This doesn't happen if USB is built into kernel. > > Weird... I did more experimentation and the behavior seems to be quite random - sometimes keyboard works ok for long time in all places, sometimes it stops working after some period of time, sometimes it doesn't work from the start and couple of times I experienced boot process going astray. Not sure what stage that was, there were endless messages spewed on the screen very fast, I couldn't read them. This leads me to the following "crazy" question - is it possible that our boot chain corrupts some vital BIOS memory? I think loader would be a primary suspect. I am not sure of anything, but a wild guess is that RAM where BIOS stores some USB-related stuff gets corrupted. Maybe it's overwritten when kernel and modules are loaded... 
-- Andriy Gapon From avg at icyb.net.ua Tue Nov 11 05:14:08 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Tue Nov 11 05:14:21 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: <491586B9.2020303@vwsoft.com> References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> Message-ID: <4919851B.7050800@icyb.net.ua> on 08/11/2008 14:31 Volker said the following: > Andriy, > > On 12/23/-58 20:59, Andriy Gapon wrote: >> I have a quite strange problem. >> This is with 7-BETA amd64. > > Did it work with earlier versions? Can't say, this is a new machine, FreeBSD took its virginity :-) >> All of USB is out of kernel and is loaded via modules. >> BIOS has "Legacy USB" enabled. >> I have only a USB keyboard, no PS/2 port. > > Can you check BIOS settings for EHCI handover? No such settings. > If the BIOS does not have handover enabled, it may disable legacy > support after a timeout, which is often bad. IMO this is the same with > booting off USB drives but every BIOS handles that different. This doesn't seem to be the case. The behavior is quite random, sometimes I can work at loader prompt for may minutes, sometimes keyboard is dead after a few seconds. Also, I think USB keyboard is handled by UHCI, not EHCI in my case, but I am not sure if this matters. My guess is that Legacy support should work until OS explicitly takes over by using special procedure (this should be done for UHCI as well). BTW, it seems that our UHCI take-over code is far more simple than what MS described here: http://www.microsoft.com/whdc/archive/usbhost.mspx#EQHAC Anyway, this happens after loader is done. >> The keyboard works file in BIOS and for selecting boot device in boot0 >> menu. It also works in loader menu. If in the menu I select to go to >> loader prompt then it works for about 5 seconds and then "dies" - no >> reaction to key presses, no led change, nothing. 
>> I haven't actually verified if the keyboard would still work if I stayed >> in loader menu for longer than ~10 seconds. >> >> This doesn't happen if USB is built into kernel. > > That sound strange. I have no idea why that might work (or I'm totally > wrong with my handover theory). I was incorrect about the above, I have already seen it happening both ways. >> Weird... > > Yes, sounds like or it's probably easily explainable ;) -- Andriy Gapon From ken73.chen at gmail.com Tue Nov 11 06:16:41 2008 From: ken73.chen at gmail.com (Ken Chen) Date: Tue Nov 11 06:16:48 2008 Subject: php-cgi frozen with sbwait when SMP enable In-Reply-To: <4917285C.2030702@lozenetz.org> References: <20081107142137.GA7051@icarus.home.lan> <4917285C.2030702@lozenetz.org> Message-ID: I think the parent php-cgi are very health. I have tried: There are total 49 php-cgi processes are running or frozen, the '1 wait' is parent . web4# ps alx | grep php-cgi | grep -v grep | awk '{print $9}' | sort | uniq -c | sort -n 1 biowr 1 wait 15 sbwait 32 accept Kill one of frozen php-cgi processes. web4# kill -9 61392 Check again the amount of php-cgi processes, there are still 49 php-cgi procerss. web4# ps alx | grep php-cgi | grep -v grep | awk '{print $9}' | sort | uniq -c | sort -n 1 biord 1 bo_wwa 1 wait 4 - 17 sbwait 25 accept 2008/11/10 Anton - Valqk > Oh, just saw that, this could be caused by dead parent php-cgi processes > (just a guess). > I used to run lighttpd with span-fcgi executable and it happens very > often to have dead parents (of php-cgi childs) that must be killed by > killall php-cgi (eg. restart _ALL_ php-cgi processes, pretty stupid!!! > but if you have dead parent you can't know which childs to kill)... 
> If you run your php-cgi processes just from the lighttpd(and lighttpd > manages php-cgi processes) try running it with fcgi-spawn and write a > script to check parents of the php-cgi backends and you'll see if that's > the cause of having 'hang' phps :( > > pls tell me what is it. I'm interested! > > cheers, > valqk. > Ken Chen wrote: > > Hi Jeremy, > > > > A health FastCGI process have a lifetime, so the PIDs of all php-cgi > > processes should in a short range. > > > > There are some 'php-cgi' fall in 'sbwait' state, and stay there forever. > The > > frozen 'php-cgi' can't accept new request, so never retire. > > > > Please forgive my poor English. > > > > 2008/11/7 Jeremy Chadwick > > > > > >> On Fri, Nov 07, 2008 at 07:29:37PM +0800, Ken Chen wrote: > >> > >>> Hello, > >>> > >>> I have 4 web servers with lighttpd to serve one web site with DNS load > >>> sharing. On the 2 SMP-enable web servers, there will be many php-cgi > >>> > >> frozen > >> > >>> in 'sbwait' state every day. It means the php-cgi stay in 'sbwait' > state, > >>> and never be back to 'accept' or other state. If I restart them, there > >>> > >> will > >> > >>> be frozen php-cgi appear some hours later. > >>> > >>> There is no problem on the other single CPU web servers which running > >>> > >> same > >> > >>> php scripts and same configuration and version of PHP. > >>> > >>> Why and any solution? > >>> > >> I'm not understanding what the problem is (and I've seen the output you > >> provided later in the thread). Are you stating the problem is that you > >> see many php-cgi processes? Or are you worried they're not doing > >> anything? Does the website function, lock up, or anything like that? > >> If not, what's the issue? :-) > >> > >> -- > >> | Jeremy Chadwick jdc at parodius.com | > >> | Parodius Networking http://www.parodius.com/ | > >> | UNIX Systems Administrator Mountain View, CA, USA | > >> | Making life hard for others since 1977. 
PGP: 4BD6C0CB | > >> > >> > >> > > _______________________________________________ > > freebsd-stable@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org > " > > > > > > From ken73.chen at gmail.com Tue Nov 11 06:36:38 2008 From: ken73.chen at gmail.com (Ken Chen) Date: Tue Nov 11 06:36:45 2008 Subject: php-cgi frozen with sbwait when SMP enable In-Reply-To: <491726F4.4040808@lozenetz.org> References: <491726F4.4040808@lozenetz.org> Message-ID: The report from lighttpd looks fine: cache.cached-itmes: 98293 cache.hitrate(%): 96 cache.memory-inuse(KB): 6143 fastcgi.active-requests: 16 fastcgi.backend.0.0.connected: 419008 fastcgi.backend.0.0.died: 0 fastcgi.backend.0.0.disabled: 0 fastcgi.backend.0.0.load: 16 fastcgi.backend.0.0.overloaded: 0 fastcgi.backend.0.load: 17 fastcgi.requests: 419008 But at this moment: web4# ps alx | grep php-cgi | grep -v grep | awk '{print $9}' | sort | uniq -c | sort -n 1 biord 1 wait 2 - 16 sbwait 29 accept web4# ps alx | grep php-cgi | grep -v grep | grep sbwait 65534 61392 61384 1 4 -15 182328 69312 sbwait I< ?? 0:53.39 /usr/local/bin/php-cgi 65534 61399 61384 0 4 -15 182328 71112 sbwait I< ?? 1:09.60 /usr/local/bin/php-cgi 65534 61409 61384 0 4 -15 182328 72812 sbwait I< ?? 1:39.40 /usr/local/bin/php-cgi 65534 61411 61384 0 4 -15 183352 74536 sbwait I< ?? 1:49.08 /usr/local/bin/php-cgi 65534 61412 61384 0 4 -15 183352 74508 sbwait I< ?? 1:33.31 /usr/local/bin/php-cgi 65534 61414 61384 0 4 -15 182328 62860 sbwait I< ?? 0:28.81 /usr/local/bin/php-cgi 65534 61418 61384 0 4 -15 182328 71448 sbwait I< ?? 1:17.56 /usr/local/bin/php-cgi 65534 61426 61384 0 4 -15 183352 60456 sbwait I< ?? 0:22.16 /usr/local/bin/php-cgi 65534 71529 61384 0 4 -15 182328 74144 sbwait I< ?? 0:36.87 /usr/local/bin/php-cgi 65534 71555 61384 0 4 -15 182328 72820 sbwait I< ?? 
0:19.19 /usr/local/bin/php-cgi 65534 71556 61384 0 4 -15 182328 74452 sbwait I< ?? 0:38.27 /usr/local/bin/php-cgi 65534 71590 61384 0 4 -15 182328 76828 sbwait I< ?? 0:57.42 /usr/local/bin/php-cgi 65534 71594 61384 0 4 -15 182328 75576 sbwait I< ?? 0:46.50 /usr/local/bin/php-cgi 65534 71595 61384 0 4 -15 182328 84048 sbwait I< ?? 1:52.15 /usr/local/bin/php-cgi 65534 77285 61384 0 4 -15 182328 88280 sbwait S< ?? 0:15.22 /usr/local/bin/php-cgi 65534 77288 61384 3 4 -15 182328 88808 sbwait S< ?? 0:14.43 /usr/local/bin/php-cgi 65534 77317 61384 0 4 -15 182328 88912 sbwait S< ?? 0:12.79 /usr/local/bin/php-cgi 65534 77323 61384 0 4 -15 182328 89140 sbwait S< ?? 0:13.51 /usr/local/bin/php-cgi 65534 77359 61384 6 4 -15 182328 88200 sbwait S< ?? 0:13.04 /usr/local/bin/php-cgi 65534 77372 61384 2 4 -15 182328 89200 sbwait S< ?? 0:12.16 /usr/local/bin/php-cgi 65534 77392 61384 1 4 -15 181304 87200 sbwait S< ?? 0:11.02 /usr/local/bin/php-cgi 65534 77401 61384 1 4 -15 182328 88800 sbwait S< ?? 0:12.49 /usr/local/bin/php-cgi The PID of php-cgi which are less than 71595 are frozen. 2008/11/10 Anton - Valqk > You can try taking look to lighttpd status and fcgi processes status > like this: > server.modules += ( "mod_status" ) > status.status-url = "/server-status" > status.statistics-url = "/sstatus1" > > status.statistics-url gives info for each fastcgi like this: > > fastcgi.active-requests: 0 > fastcgi.backend.backend1.0.connected: 12493970 > fastcgi.backend.backend1.0.died: 0 > fastcgi.backend.backend1.0.disabled: 0 > fastcgi.backend.backend1.0.load: 0 > fastcgi.backend.backend1.0.overloaded: 0 > fastcgi.backend.backend1.load: 1 > fastcgi.requests: 19479062 > > > etc... read what each means on lighttpd site... > pls tell what caused this, it'd be very interesting to me! > > cheers, > valqk. > > Ken Chen wrote: > > I capture something. > > > > Please check the PID 57776. It's CPU time never change since my previous > > mail here. 
> > > > web4# ps alx | grep php-cgi | grep -v grep | grep sbwait > > 65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 > > /usr/local/bin/php-cgi > > 65534 57801 47240 0 4 0 182328 82408 sbwait I ?? 0:19.97 > > /usr/local/bin/php-cgi > > 65534 57809 47240 0 4 0 182328 84096 sbwait I ?? 1:12.03 > > /usr/local/bin/php-cgi > > 65534 57823 47240 0 4 0 182328 84492 sbwait I ?? 2:04.21 > > /usr/local/bin/php-cgi > > 65534 57833 47240 0 4 0 183352 83316 sbwait I ?? 0:28.62 > > /usr/local/bin/php-cgi > > 65534 57866 47240 0 4 0 182328 79952 sbwait I ?? 0:05.92 > > /usr/local/bin/php-cgi > > 65534 57870 47240 0 4 0 182328 83184 sbwait I ?? 0:56.83 > > /usr/local/bin/php-cgi > > 65534 57871 47240 0 4 0 182328 83388 sbwait I ?? 0:54.96 > > /usr/local/bin/php-cgi > > 65534 57891 47240 0 4 0 182328 84436 sbwait I ?? 1:58.32 > > /usr/local/bin/php-cgi > > 65534 57925 47240 0 4 0 182328 84380 sbwait I ?? 2:03.53 > > /usr/local/bin/php-cgi > > 65534 65944 47240 0 4 0 182328 84184 sbwait I ?? 0:39.97 > > /usr/local/bin/php-cgi > > 65534 65952 47240 0 4 0 182328 84408 sbwait I ?? 0:21.37 > > /usr/local/bin/php-cgi > > 65534 66007 47240 0 4 0 183352 90960 sbwait I ?? 1:16.81 > > /usr/local/bin/php-cgi > > 65534 66014 47240 5 4 0 182328 92748 sbwait S ?? 1:41.23 > > /usr/local/bin/php-cgi > > 65534 66038 47240 1 4 0 182328 91900 sbwait I ?? 1:38.04 > > /usr/local/bin/php-cgi > > 65534 66060 47240 0 4 0 182328 90048 sbwait I ?? 1:15.46 > > /usr/local/bin/php-cgi > > 65534 66078 47240 3 4 0 182328 92224 sbwait S ?? 
1:39.66 > > /usr/local/bin/php-cgi > > web4# top -b > > last pid: 70768; load averages: 1.62, 1.65, 1.43 up 4+15:56:06 > > 22:53:48 > > 85 processes: 1 running, 84 sleeping > > > > Mem: 492M Active, 1204M Inact, 218M Wired, 60M Cache, 112M Buf, 27M Free > > Swap: 2019M Total, 20K Used, 2019M Free > > > > > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > > 69544 nobody 1 8 0 203M 38500K nanslp 1 6:31 11.33% php > > 47290 nobody 1 4 0 101M 98M kqread 1 30:42 2.98% > lighttpd > > 66526 nobody 1 4 0 178M 92796K accept 1 1:40 1.12% php-cgi > > 66077 nobody 1 4 0 178M 92512K accept 0 1:49 1.07% php-cgi > > 65921 nobody 1 4 0 178M 92696K accept 0 1:43 0.98% php-cgi > > 65968 nobody 1 4 0 178M 92484K accept 0 1:43 0.93% php-cgi > > 66017 nobody 1 4 0 178M 92444K accept 0 1:50 0.88% php-cgi > > 65979 nobody 1 4 0 178M 92676K accept 1 1:44 0.88% php-cgi > > 66424 nobody 1 4 0 178M 92928K accept 1 1:36 0.88% php-cgi > > 65938 nobody 1 4 0 178M 92336K accept 1 1:52 0.73% php-cgi > > 65951 nobody 1 4 0 178M 92704K accept 0 1:48 0.73% php-cgi > > 66016 nobody 1 4 0 178M 92232K accept 1 1:41 0.73% php-cgi > > 65950 nobody 1 4 0 178M 93192K accept 0 1:51 0.68% php-cgi > > 65999 nobody 1 4 0 178M 92940K accept 1 1:46 0.63% php-cgi > > 66008 nobody 1 4 0 178M 93000K accept 1 1:46 0.63% php-cgi > > 69286 nobody 1 4 0 178M 92208K accept 1 0:37 0.63% php-cgi > > 47289 nobody 1 4 0 73400K 70640K kqread 1 12:02 0.59% > lighttpd > > 65980 nobody 1 4 0 178M 93156K accept 1 1:51 0.59% php-cgi > > 2008/11/7 Ivan Voras > > > > > >> Ken Chen wrote: > >> > >>> Oh.. sorry, I forgot to provide the information of my environment. 
> >>> > >>> web4# php-cgi -v > >>> PHP 5.2.6 (cgi-fcgi) (built: Nov 2 2008 11:16:30) > >>> Copyright (c) 1997-2008 The PHP Group > >>> Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies > >>> with XCache v1.2.2, Copyright (c) 2005-2007, by mOo > >>> web4# /usr/local/lighttpd/sbin/lighttpd -v > >>> lighttpd-1.4.19 - a light and fast webserver > >>> Build-Date: Sep 1 2008 16:58:51 > >>> web4# uname -a > >>> FreeBSD web4.xxxx.com 7.0-RELEASE-p5 FreeBSD 7.0-RELEASE-p5 #11: Mon > Nov > >>> > >> 3 > >> > >>> 01:10:36 CST 2008 root@web4.xxxx.com:/usr/obj/usr/src/sys/WEB4 > i386 > >>> web4# ps alx | grep php-cgi | grep -v grep | grep sbwait > >>> 65534 57776 47240 0 4 0 182328 84984 sbwait I ?? 2:02.12 > >>> /usr/local/bin/php-cgi > >>> 65534 57801 47240 0 4 0 182328 82408 sbwait I ?? 0:19.97 > >>> /usr/local/bin/php-cgi > >>> 65534 57809 47240 0 4 0 182328 84096 sbwait I ?? 1:12.03 > >>> /usr/local/bin/php-cgi > >>> 65534 57823 47240 0 4 0 182328 84492 sbwait I ?? 2:04.21 > >>> /usr/local/bin/php-cgi > >>> 65534 57833 47240 0 4 0 183352 83316 sbwait I ?? 0:28.62 > >>> /usr/local/bin/php-cgi > >>> 65534 57866 47240 0 4 0 182328 79952 sbwait I ?? 0:05.92 > >>> /usr/local/bin/php-cgi > >>> 65534 57870 47240 0 4 0 182328 83184 sbwait I ?? 0:56.83 > >>> /usr/local/bin/php-cgi > >>> 65534 57871 47240 0 4 0 182328 83388 sbwait I ?? 0:54.96 > >>> /usr/local/bin/php-cgi > >>> 65534 57891 47240 0 4 0 182328 84436 sbwait I ?? 1:58.32 > >>> /usr/local/bin/php-cgi > >>> 65534 57925 47240 0 4 0 182328 84380 sbwait I ?? 2:03.53 > >>> /usr/local/bin/php-cgi > >>> 65534 65944 47240 0 4 0 182328 84184 sbwait I ?? 0:39.97 > >>> /usr/local/bin/php-cgi > >>> 65534 65952 47240 0 4 0 182328 84408 sbwait I ?? 0:21.37 > >>> /usr/local/bin/php-cgi > >>> > >> This does seem a bit unusual, but seeing that your execution times are > >> not null it might that the PHP servers are actually doing some useful > >> work. 
You should have a mixture of various states in PHP - do they show > >> up in top? > >> > >> My own example is: > >> > >> last pid: 77421; load averages: 2.82, 2.59, 2.13 > >> up > >> 55+16:58:49 15:48:16 > >> 209 processes: 2 running, 206 sleeping, 1 zombie > >> CPU: 49.8% user, 0.0% nice, 2.8% system, 0.0% interrupt, 47.4% idle > >> Mem: 1493M Active, 1583M Inact, 278M Wired, 139M Cache, 112M Buf, 505M > Free > >> Swap: 4500M Total, 416M Used, 4084M Free, 9% Inuse > >> > >> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > >> 75863 www 1 4 0 162M 50020K sbwait 3 2:54 36.77% > php-cgi > >> 76830 www 1 103 0 156M 41556K CPU2 3 1:28 36.77% > php-cgi > >> 76834 www 1 4 0 163M 56628K sbwait 0 2:23 33.59% > php-cgi > >> 76019 www 1 4 0 150M 38948K accept 3 3:12 20.56% > php-cgi > >> 76825 www 1 4 0 158M 42912K accept 2 1:21 18.16% > php-cgi > >> 76846 www 1 4 0 162M 42600K sbwait 1 1:07 14.36% > php-cgi > >> 76835 www 1 4 0 151M 39948K accept 2 1:28 12.60% > php-cgi > >> 76829 www 1 4 0 150M 36564K sbwait 2 1:46 2.98% > php-cgi > >> > >> This is unusually high load, a spike, for this server but it has many > >> cores and it's stable. It's also running 7.1-PRERELEASE. > >> > >> > >> > > _______________________________________________ > > freebsd-stable@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org > " > > > > > > From eksffa at freebsdbrasil.com.br Tue Nov 11 07:07:04 2008 From: eksffa at freebsdbrasil.com.br (Patrick Tracanelli) Date: Tue Nov 11 07:07:11 2008 Subject: MFC Request Message-ID: <49199F76.5010800@freebsdbrasil.com.br> Is it possible to have traceroute MFC'd for 7.1? I would like to have -a and -A switchs (ASN Path mapping) available. 
Thank you :) -- Patrick Tracanelli From koitsu at FreeBSD.org Tue Nov 11 08:05:28 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Tue Nov 11 08:05:35 2008 Subject: php-cgi frozen with sbwait when SMP enable In-Reply-To: References: <20081107142137.GA7051@icarus.home.lan> <4917285C.2030702@lozenetz.org> Message-ID: <20081111160526.GA1608@icarus.home.lan> On Tue, Nov 11, 2008 at 10:09:38PM +0800, Ken Chen wrote: > I think the parent php-cgi are very health. I have tried: > > There are total 49 php-cgi processes are running or frozen, the '1 wait' is > parent . > > web4# ps alx | grep php-cgi | grep -v grep | awk '{print $9}' | sort | uniq > -c | sort -n > 1 biowr > 1 wait > 15 sbwait > 32 accept > > Kill one of frozen php-cgi processes. > > web4# kill -9 61392 > > Check again the amount of php-cgi processes, there are still 49 php-cgi > procerss. > > web4# ps alx | grep php-cgi | grep -v grep | awk '{print $9}' | sort | uniq > -c | sort -n > 1 biord > 1 bo_wwa > 1 wait > 4 - > 17 sbwait > 25 accept I would recommend you try the lighttpd and fastcgi mailing lists at this point. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From bseklecki at collaborativefusion.com Tue Nov 11 09:03:09 2008 From: bseklecki at collaborativefusion.com (Brian A. Seklecki) Date: Tue Nov 11 09:03:16 2008 Subject: MFC Request In-Reply-To: <49199F76.5010800@freebsdbrasil.com.br> References: <49199F76.5010800@freebsdbrasil.com.br> Message-ID: <1226421386.27892.45.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> On Tue, 2008-11-11 at 13:06 -0200, Patrick Tracanelli wrote: > Is it possible to have traceroute MFC'd for 7.1? I would like to have -a I second that request. I'm prepared to bribe someone as well. ~BAS > and -A switchs (ASN Path mapping) available. Thank you :) > -- Brian A. Seklecki Collaborative Fusion, Inc. 
From peter at wemm.org Tue Nov 11 10:55:47 2008 From: peter at wemm.org (Peter Wemm) Date: Tue Nov 11 10:55:54 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: <4919851B.7050800@icyb.net.ua> References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> Message-ID: On Tue, Nov 11, 2008 at 5:14 AM, Andriy Gapon wrote: > on 08/11/2008 14:31 Volker said the following: >> Andriy, >> >> On 12/23/-58 20:59, Andriy Gapon wrote: >>> I have a quite strange problem. >>> This is with 7-BETA amd64. >> >> Did it work with earlier versions? > > Can't say, this is a new machine, FreeBSD took its virginity :-) > >>> All of USB is out of kernel and is loaded via modules. >>> BIOS has "Legacy USB" enabled. >>> I have only a USB keyboard, no PS/2 port. >> >> Can you check BIOS settings for EHCI handover? > > No such settings. > >> If the BIOS does not have handover enabled, it may disable legacy >> support after a timeout, which is often bad. IMO this is the same with >> booting off USB drives but every BIOS handles that different. > > This doesn't seem to be the case. The behavior is quite random, > sometimes I can work at loader prompt for may minutes, sometimes > keyboard is dead after a few seconds. > Also, I think USB keyboard is handled by UHCI, not EHCI in my case, but > I am not sure if this matters.
My guess is that Legacy support should > work until OS explicitly takes over by using special procedure (this > should be done for UHCI as well). > > BTW, it seems that our UHCI take-over code is far more simple than what > MS described here: > http://www.microsoft.com/whdc/archive/usbhost.mspx#EQHAC > > Anyway, this happens after loader is done. > >>> The keyboard works file in BIOS and for selecting boot device in boot0 >>> menu. It also works in loader menu. If in the menu I select to go to >>> loader prompt then it works for about 5 seconds and then "dies" - no >>> reaction to key presses, no led change, nothing. >>> I haven't actually verified if the keyboard would still work if I stayed >>> in loader menu for longer than ~10 seconds. >>> >>> This doesn't happen if USB is built into kernel. >> >> That sound strange. I have no idea why that might work (or I'm totally >> wrong with my handover theory). > > I was incorrect about the above, I have already seen it happening both ways. > >>> Weird... >> >> Yes, sounds like or it's probably easily explainable ;) > > > -- > Andriy Gapon > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > Some bioses have a list of MBR partition id's and use that to determine what to do with the USB keyboard. One of my ol older amd64 motherboards worked but would always disable the usb keyboard right as loader started. I discovered the following: * If I put the freebsd bootblocks and loader on a floppy drive (no MBR), then the bios did not turn off the keyboard. It always continued to work for loader. * If i hacked the boot bootblocks and loader and kernel to recognize different MBR slice id nubmers as "ours", then changing the freebsd MBR to be "msdos" or "linux" also worked for that BIOS. It would no longer turn off the USB keyboard. 
I don't recall which Id number I used instead of 165 - it was about 4 years ago. * There were other consequences of using the partition ID hack - I think I remember it turning off the apic for msdos mode. Your problems may be different, but mine were caused by a BIOS whitelist of MBR partition id's. What a stupid problem. On that motherboard I ended up taking the path of least resistance and using the PS/2 adapter plug on the keyboard. -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV "All of this is for nothing if we don't go to the stars" - JMS/B5 "If Java had true garbage collection, most programs would delete themselves upon execution." -- Robert Sewell From ken73.chen at gmail.com Tue Nov 11 11:14:07 2008 From: ken73.chen at gmail.com (Ken Chen) Date: Tue Nov 11 11:14:14 2008 Subject: php-cgi frozen with sbwait when SMP enable In-Reply-To: <20081111160526.GA1608@icarus.home.lan> References: <20081107142137.GA7051@icarus.home.lan> <4917285C.2030702@lozenetz.org> <20081111160526.GA1608@icarus.home.lan> Message-ID: Thank Jeremy, I will try. 2008/11/12 Jeremy Chadwick > On Tue, Nov 11, 2008 at 10:09:38PM +0800, Ken Chen wrote: > > I think the parent php-cgi are very health. I have tried: > > > > There are total 49 php-cgi processes are running or frozen, the '1 wait' > is > > parent . > > > > web4# ps alx | grep php-cgi | grep -v grep | awk '{print $9}' | sort | > uniq > > -c | sort -n > > 1 biowr > > 1 wait > > 15 sbwait > > 32 accept > > > > Kill one of frozen php-cgi processes. > > > > web4# kill -9 61392 > > > > Check again the amount of php-cgi processes, there are still 49 php-cgi > > procerss. > > > > web4# ps alx | grep php-cgi | grep -v grep | awk '{print $9}' | sort | > uniq > > -c | sort -n > > 1 biord > > 1 bo_wwa > > 1 wait > > 4 - > > 17 sbwait > > 25 accept > > I would recommend you try the lighttpd and fastcgi mailing lists at this > point. 
> > -- > | Jeremy Chadwick jdc at parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. PGP: 4BD6C0CB | > > From geoff.sweet at x10.com Tue Nov 11 11:14:13 2008 From: geoff.sweet at x10.com (Geoff Sweet) Date: Tue Nov 11 11:14:24 2008 Subject: /usr/local/etc/rc.d getting run twice Message-ID: <1226429104.8047.5.camel@gsweet-laptop> Greetings, I have new freshly installed servers of FreeBSD 6.3. We started our migration before 6.4 was released. Anyway we have some custom tools that we have built startup scripts for and placed them into /usr/local/etc/rc.d. However the problem is that these tools get called twice to startup. Thus leaving us with two instances of the tools running. I've read a thread similar to this question in the mailing list archive; however, it had to do with upgrading and didn't seem to apply to my issue. I've added a bunch of logic to the script to try to determine if the tool is running or not, but that seems like band-aiding the issue instead of fixing the problem. Any advice? -Geoff Sweet From geoff.sweet at x10.com Tue Nov 11 11:54:32 2008 From: geoff.sweet at x10.com (Geoff Sweet) Date: Tue Nov 11 11:54:39 2008 Subject: /usr/local/etc/rc.d getting run twice In-Reply-To: <20081111192153.GI69155@bunrab.catwhisker.org> References: <1226429104.8047.5.camel@gsweet-laptop> <20081111192153.GI69155@bunrab.catwhisker.org> Message-ID: <1226433257.8047.11.camel@gsweet-laptop> Ah, indeed that seems to be the problem. Thank you, I may never have spotted that on my own. -Geoff Sweet On Tue, 2008-11-11 at 11:21 -0800, David Wolfskill wrote: > On Tue, Nov 11, 2008 at 10:45:04AM -0800, Geoff Sweet wrote: > > Greetings, I have new freshly installed servers of FreeBSD 6.3.... > > Any advice? > > Check the value of local_startup in /etc/{defaults/,}rc.conf.
> > In 6.3, I believe the default was > > /usr/local/etc/rc.d /usr/X11R6/etc/rc.d > > but you may find that /usr/X11R6 has become a symlink to /usr/local > (after upgrading X11). > > If it is, you can avoid the problem by placing > > local_startup="/usr/local/etc/rc.d" > > in /etc/rc.conf. > > Peace, > david From david at catwhisker.org Tue Nov 11 11:56:22 2008 From: david at catwhisker.org (David Wolfskill) Date: Tue Nov 11 11:56:29 2008 Subject: /usr/local/etc/rc.d getting run twice In-Reply-To: <1226429104.8047.5.camel@gsweet-laptop> References: <1226429104.8047.5.camel@gsweet-laptop> Message-ID: <20081111192153.GI69155@bunrab.catwhisker.org> On Tue, Nov 11, 2008 at 10:45:04AM -0800, Geoff Sweet wrote: > Greetings, I have new freshly installed servers of FreeBSD 6.3.... > Any advice? Check the value of local_startup in /etc/{defaults/,}rc.conf. In 6.3, I believe the default was /usr/local/etc/rc.d /usr/X11R6/etc/rc.d but you may find that /usr/X11R6 has become a symlink to /usr/local (after upgrading X11). If it is, you can avoid the problem by placing local_startup="/usr/local/etc/rc.d" in /etc/rc.conf. Peace, david -- David H. Wolfskill david@catwhisker.org Depriving a girl or boy of an opportunity for education is evil. See http://www.catwhisker.org/~david/publickey.gpg for my public key. From volker at vwsoft.com Tue Nov 11 12:09:22 2008 From: volker at vwsoft.com (Volker) Date: Tue Nov 11 12:09:35 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> Message-ID: <4919E65C.1020307@vwsoft.com> On 11/11/08 19:55, Peter Wemm wrote: > ...
> * There were other consequences of using the partition ID hack - I > think I remember it turning off the apic for msdos mode. > > Your problems may be different, but mine were caused by a BIOS > whitelist of MBR partition id's. What a stupid problem. On that > motherboard I ended up taking the path of least resistance and using > the PS/2 adapter plug on the keyboard. Peter, very interesting what you've found. That reminds me on some investigations I did as I was hunting USB boot device problems. Some BIOSes do not check the partition (slice) ID but are looking for a file system magic. If a FAT filesystem is detected, the BIOS does some stupid things (like ignoring the active partition flag and booting the FAT slice no matter what you've flagged active). Just an example and off-topic to Andriy's keyboard problem. But when combining that with your findings, it may still be a thing to check for... ;) Volker From stuartb at 4gh.net Tue Nov 11 12:11:10 2008 From: stuartb at 4gh.net (Stuart Barkley) Date: Tue Nov 11 12:11:18 2008 Subject: /usr/local/etc/rc.d getting run twice In-Reply-To: <1226429104.8047.5.camel@gsweet-laptop> References: <1226429104.8047.5.camel@gsweet-laptop> Message-ID: On Tue, 11 Nov 2008 at 13:45 -0000, Geoff Sweet wrote: > Greetings, I have new freshly installed servers of FreeBSD 6.3. We > started our migration before 6.4 was released. Anyway we have some > custom tools that we have built startup scripts for and placed them > into /usr/local/etc/rc.d. However the problem is that these tools get > called twice to startup. Thus leaving us with two instances of the > tools running. You probably need the following line added to /etc/rc.conf: local_startup="/usr/local/etc/rc.d" The defaults in /etc/defaults/rc.conf still try to run startup scripts in /usr/local/etc/rc.d and /usr/X11R6/etc/rc.d, however these are probably the same location via a /usr/X11R6 link. One of the X ports attempts to add this, but I've seen cases where it doesn't get added. 
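For anyone who wants to check whether the directories in local_startup collapse to the same place, here is a minimal sketch in Python. It is only an illustration and not part of rc(8): the function name unique_startup_dirs is made up for this example, and the default list is the 6.3 default quoted in this thread.

```python
import os


def unique_startup_dirs(dirs):
    """Return dirs in order, dropping any entry whose real path was already seen.

    Hypothetical helper for illustration; rc(8) itself does not do this.
    """
    seen = set()
    unique = []
    for d in dirs:
        # realpath follows symlinks, e.g. /usr/X11R6 -> /usr/local
        real = os.path.realpath(d)
        if real not in seen:
            seen.add(real)
            unique.append(d)
    return unique


if __name__ == "__main__":
    # Default quoted in the thread for 6.3 (an assumption for other releases).
    default_dirs = ["/usr/local/etc/rc.d", "/usr/X11R6/etc/rc.d"]
    print('local_startup="%s"' % " ".join(unique_startup_dirs(default_dirs)))
```

If the printed list is shorter than the input, both entries point at the same directory, and every script in it would be run once per entry.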
Stuart -- I've never been lost; I was once bewildered for three days, but never lost! -- Daniel Boone From peter at wemm.org Tue Nov 11 12:23:40 2008 From: peter at wemm.org (Peter Wemm) Date: Tue Nov 11 12:23:48 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: <4919E65C.1020307@vwsoft.com> References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> <4919E65C.1020307@vwsoft.com> Message-ID: On Tue, Nov 11, 2008 at 12:09 PM, Volker wrote: > On 11/11/08 19:55, Peter Wemm wrote: >> ... >> * There were other consequences of using the partition ID hack - I >> think I remember it turning off the apic for msdos mode. >> >> Your problems may be different, but mine were caused by a BIOS >> whitelist of MBR partition id's. What a stupid problem. On that >> motherboard I ended up taking the path of least resistance and using >> the PS/2 adapter plug on the keyboard. > > Peter, > > very interesting what you've found. That reminds me on some > investigations I did as I was hunting USB boot device problems. > > Some BIOSes do not check the partition (slice) ID but are looking for a > file system magic. If a FAT filesystem is detected, the BIOS does some > stupid things (like ignoring the active partition flag and booting the > FAT slice no matter what you've flagged active). Just an example and > off-topic to Andriy's keyboard problem. > > But when combining that with your findings, it may still be a thing to > check for... ;) > > Volker I have long since stopped being surprised by what bios writers come up with. Or should I say "windows boot loader" instead of bios, because that is what it seems to have degenerated into these days. -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV "All of this is for nothing if we don't go to the stars" - JMS/B5 "If Java had true garbage collection, most programs would delete themselves upon execution." 
-- Robert Sewell From koitsu at FreeBSD.org Tue Nov 11 12:24:29 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Tue Nov 11 12:24:42 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: <4919E65C.1020307@vwsoft.com> References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> <4919E65C.1020307@vwsoft.com> Message-ID: <20081111202425.GA6568@icarus.home.lan> On Tue, Nov 11, 2008 at 09:09:00PM +0100, Volker wrote: > On 11/11/08 19:55, Peter Wemm wrote: > > ... > > * There were other consequences of using the partition ID hack - I > > think I remember it turning off the apic for msdos mode. > > > > Your problems may be different, but mine were caused by a BIOS > > whitelist of MBR partition id's. What a stupid problem. On that > > motherboard I ended up taking the path of least resistance and using > > the PS/2 adapter plug on the keyboard. > > Peter, > > very interesting what you've found. That reminds me on some > investigations I did as I was hunting USB boot device problems. > > Some BIOSes do not check the partition (slice) ID but are looking for a > file system magic. If a FAT filesystem is detected, the BIOS does some > stupid things (like ignoring the active partition flag and booting the > FAT slice no matter what you've flagged active). Just an example and > off-topic to Andriy's keyboard problem. > > But when combining that with your findings, it may still be a thing to > check for... ;) Since you folks in this thread have some pretty good experience with BIOS behaviour and bootloader/filesystem stuff, could I ask that someone take a look at something I posted at over on -fs? I'm out of ideas at this point. http://lists.freebsd.org/pipermail/freebsd-fs/2008-November/005317.html Danke! -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From nakal at web.de Tue Nov 11 13:00:48 2008 From: nakal at web.de (Martin) Date: Tue Nov 11 13:00:56 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> Message-ID: <20081111213344.6657548c@zelda.local> Am Tue, 11 Nov 2008 10:55:45 -0800 schrieb "Peter Wemm" : > Some bioses have a list of MBR partition id's and use that to > determine what to do with the USB keyboard. One of my ol older amd64 > motherboards worked but would always disable the usb keyboard right as > loader started. I discovered the following: > * If I put the freebsd bootblocks and loader on a floppy drive (no > MBR), then the bios did not turn off the keyboard. It always > continued to work for loader. > * If i hacked the boot bootblocks and loader and kernel to recognize > different MBR slice id nubmers as "ours", then changing the freebsd > MBR to be "msdos" or "linux" also worked for that BIOS. It would no > longer turn off the USB keyboard. I don't recall which Id number I > used instead of 165 - it was about 4 years ago. > * There were other consequences of using the partition ID hack - I > think I remember it turning off the apic for msdos mode. > > Your problems may be different, but mine were caused by a BIOS > whitelist of MBR partition id's. What a stupid problem. On that > motherboard I ended up taking the path of least resistance and using > the PS/2 adapter plug on the keyboard. Hello, I want to add some information about USB problems which occur for me very frequently. I have found out that most of the problems are related to Gigabyte mainboards. I have 2 of them now. One is "EP35C-DS3R". With this mainboard sometimes my USB keyboard and USB mouse stop working (the power is simply off). I can reattach them and they both power up again. The second mainboard is "EP45-DS3R". Here the problem is even worse. 
The keyboard and mouse (both USB) lose power as soon as FreeBSD scans the USB controllers. Here, I can also reattach the devices and they are usable again. One further hint: it seems Vista (64 bit version) has the same problem with this EP45-DS3R mainboard. After it boots into the login screen, I have to reattach the devices to use them. The mainboard is not broken, I have tried 3 so far and all have these strange effects. And now... I want to remind you that I have already posted here about (same) USB problems on my laptop (Lenovo Thinkpad T60p). Sometimes I have to reattach my keyboard there, too. Of course, this is not Gigabyte here, but the weird behaviour ressembles the one above. -- Martin From pluknet at gmail.com Wed Nov 12 02:06:36 2008 From: pluknet at gmail.com (pluknet) Date: Wed Nov 12 02:06:43 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: <20081111213344.6657548c@zelda.local> References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> <20081111213344.6657548c@zelda.local> Message-ID: 2008/11/11 Martin : > Am Tue, 11 Nov 2008 10:55:45 -0800 > schrieb "Peter Wemm" : > >> Some bioses have a list of MBR partition id's and use that to >> determine what to do with the USB keyboard. One of my ol older amd64 >> motherboards worked but would always disable the usb keyboard right as >> loader started. I discovered the following: >> * If I put the freebsd bootblocks and loader on a floppy drive (no >> MBR), then the bios did not turn off the keyboard. It always >> continued to work for loader. >> * If i hacked the boot bootblocks and loader and kernel to recognize >> different MBR slice id nubmers as "ours", then changing the freebsd >> MBR to be "msdos" or "linux" also worked for that BIOS. It would no >> longer turn off the USB keyboard. I don't recall which Id number I >> used instead of 165 - it was about 4 years ago. 
>> * There were other consequences of using the partition ID hack - I >> think I remember it turning off the apic for msdos mode. >> >> Your problems may be different, but mine were caused by a BIOS >> whitelist of MBR partition id's. What a stupid problem. On that >> motherboard I ended up taking the path of least resistance and using >> the PS/2 adapter plug on the keyboard. > > Hello, > > I want to add some information about USB problems which occur for me > very frequently. > > I have found out that most of the problems are related to Gigabyte > mainboards. I have 2 of them now. One is "EP35C-DS3R". With this > mainboard sometimes my USB keyboard and USB mouse stop working (the > power is simply off). I can reattach them and they both power up again. > > The second mainboard is "EP45-DS3R". Here the problem is even worse. > The keyboard and mouse (both USB) lose power as soon as FreeBSD scans > the USB controllers. Here, I can also reattach the devices and they are > usable again. > > One further hint: it seems Vista (64 bit version) has the same problem > with this EP45-DS3R mainboard. After it boots into the login screen, I > have to reattach the devices to use them. The mainboard is not broken, > I have tried 3 so far and all have these strange effects. > > > And now... I want to remind you that I have already posted here about > (same) USB problems on my laptop (Lenovo Thinkpad T60p). Sometimes I > have to reattach my keyboard there, too. Of course, this is not > Gigabyte here, but the weird behaviour ressembles the one above. I have the same problem with my ukbd and ums devices: they are powered off during boot and I have to re-attach them. Motherboard: Asus P5K.
-- wbr, pluknet From avg at icyb.net.ua Wed Nov 12 03:29:18 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 12 03:29:31 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> Message-ID: <491ABE05.5080809@icyb.net.ua> on 11/11/2008 20:55 Peter Wemm said the following: > Some bioses have a list of MBR partition id's and use that to > determine what to do with the USB keyboard. One of my ol older amd64 > motherboards worked but would always disable the usb keyboard right as > loader started. I discovered the following: > * If I put the freebsd bootblocks and loader on a floppy drive (no > MBR), then the bios did not turn off the keyboard. It always > continued to work for loader. > * If i hacked the boot bootblocks and loader and kernel to recognize > different MBR slice id nubmers as "ours", then changing the freebsd > MBR to be "msdos" or "linux" also worked for that BIOS. It would no > longer turn off the USB keyboard. I don't recall which Id number I > used instead of 165 - it was about 4 years ago. > * There were other consequences of using the partition ID hack - I > think I remember it turning off the apic for msdos mode. > > Your problems may be different, but mine were caused by a BIOS > whitelist of MBR partition id's. What a stupid problem. On that > motherboard I ended up taking the path of least resistance and using > the PS/2 adapter plug on the keyboard. Foul play on BIOS part is definitely a big possibility. What puzzles me most is random/inconsistent behavior from boot to boot. Maybe there is some misalignment between how BIOS emulates legacy keyboard and how our boot chain interacts with it, some timing issue or something. Anyway, this is very hard to debug or guess. Most probably I will have to live with it (this system doesn't have PS/2 ports at all). 
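To see which MBR partition ids a BIOS whitelist like the one Peter describes would be looking at, the sketch below decodes the four primary entries of a 512-byte boot sector in Python. The layout used (partition table at byte offset 446, 16 bytes per entry, type byte at offset 4 of each entry, 0x55 0xAA signature at offset 510) is the standard MBR layout; the function name mbr_partition_types and the demo sector are invented for illustration. 165 (0xA5) is the FreeBSD slice id mentioned in the thread.

```python
MBR_SIZE = 512
TABLE_OFFSET = 446   # start of the four primary partition entries
ENTRY_SIZE = 16
FREEBSD_TYPE = 165   # 0xA5, the id the thread refers to


def mbr_partition_types(sector):
    """Return a list of (slice_number, is_active, type_id) for the 4 MBR entries."""
    if len(sector) < MBR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for i in range(4):
        off = TABLE_OFFSET + i * ENTRY_SIZE
        boot_flag = sector[off]      # 0x80 marks the active entry
        type_id = sector[off + 4]    # partition (slice) id byte
        entries.append((i + 1, boot_flag == 0x80, type_id))
    return entries


if __name__ == "__main__":
    # Build a fake sector with one active FreeBSD slice, purely for demonstration.
    demo = bytearray(MBR_SIZE)
    demo[510:512] = b"\x55\xaa"
    demo[TABLE_OFFSET] = 0x80
    demo[TABLE_OFFSET + 4] = FREEBSD_TYPE
    for number, active, type_id in mbr_partition_types(bytes(demo)):
        print("slice %d: active=%s type=%d (0x%02x)" % (number, active, type_id, type_id))
```

On a real machine the 512 bytes would come from the start of the boot disk or from a disk image; which device to read is left to the reader, and reading it normally needs root.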
-- Andriy Gapon From avg at icyb.net.ua Wed Nov 12 03:36:50 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 12 03:37:05 2008 Subject: ukbd attachment and root mount In-Reply-To: <4911BA93.9030006@icyb.net.ua> References: <4911BA93.9030006@icyb.net.ua> Message-ID: <491ABFCD.3060309@icyb.net.ua> on 05/11/2008 17:24 Andriy Gapon said the following: > System is FreeBSD 7.1-BETA2 amd64. > > Looking through my dmesg I see that relative order of ukbd attachment > and root mounting is not deterministic. Sometime keyboard is attached > first, sometimes root filesystem is mounted first. Quite more often root > is mounted first, though. > Example (with GENERIC kernel): > Nov 3 15:40:54 kernel: Trying to mount root from ufs:/dev/mirror/bootgm > Nov 3 15:40:54 kernel: GEOM_LABEL: Label ufs/bootfs removed. > Nov 3 15:40:54 kernel: GEOM_LABEL: Label for provider mirror/bootgm is > ufs/bootfs. > Nov 3 15:40:54 kernel: GEOM_LABEL: Label ufs/bootfs removed. > Nov 3 15:40:54 kernel: ukbd0: 1.10/1.10, addr 3> on uhub2 > Nov 3 15:40:54 kernel: kbd2 at ukbd0 > Nov 3 15:40:54 kernel: uhid0: 1.10/1.10, addr 3> on uhub2 > > Another (with custom kernel, zfs root): > Nov 4 17:54:03 odyssey kernel: Trying to mount root from zfs:tank/root > Nov 4 17:54:03 odyssey kernel: ukbd0: rev 1.10/1.10, addr 3> on uhub2 > Nov 4 17:54:03 odyssey kernel: kbd2 at ukbd0 > Nov 4 17:54:03 odyssey kernel: kbd2: ukbd0, generic (0), config:0x0, > flags:0x3d0000 > Nov 4 17:54:03 odyssey kernel: uhid0: rev 1.10/1.10, addr 3> on uhub2 > > I have a legacy-free system (no PS/2 ports, only USB) and I wanted to > try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was > bitten hard when I made a mistake and kernel could not find/mount root > filesystem. > > So I stuck at mountroot prompt without a keyboard to enter anything. > This was repeatable about 10 times after which I resorted to live cd. > > Since then I put back atkbdc into my kernel. 
I guess BIOS or USB > hardware emulate AT or PS/2 keyboard, so the USB keyboard works before > the driver attaches. I guess I need such emulation e.g. for loader or > boot0 configuration. But I guess I don't have to have atkbd driver in > kernel. This turned out not to be a complete solution as it seems that there are some quirks about legacy USB here, sometimes keyboard stops working even at loader prompt (this is described in a different thread). ukbd attachment still puzzles me a lot. I look at some older dmesg, e.g. this 7.0-RELEASE one: http://www.mavetju.org/mail/view_message.php?list=freebsd-usb&id=2709973 and see that ukbd attaches along with ums before mountroot. I look at newer dmesg and I see that ums attaches at about the same time as before but ukbd consistently attaches after mountroot. I wonder what might cause such behavior and how to fix it. I definitely would like to see ukbd attach before mountroot, I can debug this issue, but need some hints on where to start. -- Andriy Gapon From avg at icyb.net.ua Wed Nov 12 04:00:21 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 12 04:00:31 2008 Subject: ukbd attachment and root mount In-Reply-To: References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> Message-ID: <491AC502.9000507@icyb.net.ua> on 12/11/2008 13:53 Nate Eldredge said the following: > On Wed, 12 Nov 2008, Andriy Gapon wrote: > >> on 05/11/2008 17:24 Andriy Gapon said the following: > [...] >>> I have a legacy-free system (no PS/2 ports, only USB) and I wanted to >>> try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was >>> bitten hard when I made a mistake and kernel could not find/mount root >>> filesystem. >>> >>> So I stuck at mountroot prompt without a keyboard to enter anything. >>> This was repeatable about 10 times after which I resorted to live cd. >>> >>> Since then I put back atkbdc into my kernel. 
I guess BIOS or USB >>> hardware emulate AT or PS/2 keyboard, so the USB keyboard works before >>> the driver attaches. I guess I need such emulation e.g. for loader or >>> boot0 configuration. But I guess I don't have to have atkbd driver in >>> kernel. >> >> This turned out not to be a complete solution as it seems that there are >> some quirks about legacy USB here, sometimes keyboard stops working even >> at loader prompt (this is described in a different thread). >> >> ukbd attachment still puzzles me a lot. >> I look at some older dmesg, e.g. this 7.0-RELEASE one: >> http://www.mavetju.org/mail/view_message.php?list=freebsd-usb&id=2709973 >> and see that ukbd attaches along with ums before mountroot. >> >> I look at newer dmesg and I see that ums attaches at about the same time >> as before but ukbd consistently attaches after mountroot. >> I wonder what might cause such behavior and how to fix it. >> I definitely would like to see ukbd attach before mountroot, I can debug >> this issue, but need some hints on where to start. > > I haven't been following this thread, and I'm pretty sleepy right now, > so sorry if this is irrelevant, but I had a somewhat similar problem > that was fixed by adding > > hint.atkbd.0.flags="0x1" > > to /boot/device.hints . > I can try this, but I think this wouldn't help for two reasons: 1. I already tried kernel without atkb at all 2. if ukbd driver is not attached then I don't see any way USB keyboard would work in non-legacy way Anyway I will try this, thank you. 
-- Andriy Gapon From koitsu at FreeBSD.org Wed Nov 12 04:14:12 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Wed Nov 12 04:14:19 2008 Subject: ukbd attachment and root mount In-Reply-To: <491AC502.9000507@icyb.net.ua> References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> Message-ID: <20081112121410.GA24629@icarus.home.lan> On Wed, Nov 12, 2008 at 01:58:58PM +0200, Andriy Gapon wrote: > on 12/11/2008 13:53 Nate Eldredge said the following: > > On Wed, 12 Nov 2008, Andriy Gapon wrote: > > > >> on 05/11/2008 17:24 Andriy Gapon said the following: > > [...] > >>> I have a legacy-free system (no PS/2 ports, only USB) and I wanted to > >>> try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was > >>> bitten hard when I made a mistake and kernel could not find/mount root > >>> filesystem. > >>> > >>> So I stuck at mountroot prompt without a keyboard to enter anything. > >>> This was repeatable about 10 times after which I resorted to live cd. > >>> > >>> Since then I put back atkbdc into my kernel. I guess BIOS or USB > >>> hardware emulate AT or PS/2 keyboard, so the USB keyboard works before > >>> the driver attaches. I guess I need such emulation e.g. for loader or > >>> boot0 configuration. But I guess I don't have to have atkbd driver in > >>> kernel. > >> > >> This turned out not to be a complete solution as it seems that there are > >> some quirks about legacy USB here, sometimes keyboard stops working even > >> at loader prompt (this is described in a different thread). > >> > >> ukbd attachment still puzzles me a lot. > >> I look at some older dmesg, e.g. this 7.0-RELEASE one: > >> http://www.mavetju.org/mail/view_message.php?list=freebsd-usb&id=2709973 > >> and see that ukbd attaches along with ums before mountroot. > >> > >> I look at newer dmesg and I see that ums attaches at about the same time > >> as before but ukbd consistently attaches after mountroot. 
> >> I wonder what might cause such behavior and how to fix it. > >> I definitely would like to see ukbd attach before mountroot, I can debug > >> this issue, but need some hints on where to start. > > > > I haven't been following this thread, and I'm pretty sleepy right now, > > so sorry if this is irrelevant, but I had a somewhat similar problem > > that was fixed by adding > > > > hint.atkbd.0.flags="0x1" > > > > to /boot/device.hints . To those reading, the above setting enables the following option: bit 0 (FAIL_IF_NO_KBD) By default the atkbd driver will install even if a keyboard is not actually connected to the system. This option prevents the driver from being installed in this situation. > I can try this, but I think this wouldn't help for two reasons: > 1. I already tried kernel without atkb at all > 2. if ukbd driver is not attached then I don't see any way USB keyboard > would work in non-legacy way Regarding #2: at which stage? boot0/boot2/loader require an AT or PS/2 keyboard to work. None of these stages use ukbd(4) or anything -- there is no kernel loaded at this point!! Meaning: if you have a USB keyboard, your BIOS will need to have a "USB Legacy" option to cause it to act as a PS/2 keyboard, for typing in boot0/boot2/loader to work. Device hints are for kernel drivers, once the kernel is loaded. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From neldredge at math.ucsd.edu Wed Nov 12 04:18:37 2008 From: neldredge at math.ucsd.edu (Nate Eldredge) Date: Wed Nov 12 04:18:48 2008 Subject: ukbd attachment and root mount In-Reply-To: <491ABFCD.3060309@icyb.net.ua> References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> Message-ID: On Wed, 12 Nov 2008, Andriy Gapon wrote: > on 05/11/2008 17:24 Andriy Gapon said the following: [...] 
>> I have a legacy-free system (no PS/2 ports, only USB) and I wanted to >> try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was >> bitten hard when I made a mistake and kernel could not find/mount root >> filesystem. >> >> So I stuck at mountroot prompt without a keyboard to enter anything. >> This was repeatable about 10 times after which I resorted to live cd. >> >> Since then I put back atkbdc into my kernel. I guess BIOS or USB >> hardware emulate AT or PS/2 keyboard, so the USB keyboard works before >> the driver attaches. I guess I need such emulation e.g. for loader or >> boot0 configuration. But I guess I don't have to have atkbd driver in >> kernel. > > This turned out not to be a complete solution as it seems that there are > some quirks about legacy USB here, sometimes keyboard stops working even > at loader prompt (this is described in a different thread). > > ukbd attachment still puzzles me a lot. > I look at some older dmesg, e.g. this 7.0-RELEASE one: > http://www.mavetju.org/mail/view_message.php?list=freebsd-usb&id=2709973 > and see that ukbd attaches along with ums before mountroot. > > I look at newer dmesg and I see that ums attaches at about the same time > as before but ukbd consistently attaches after mountroot. > I wonder what might cause such behavior and how to fix it. > I definitely would like to see ukbd attach before mountroot, I can debug > this issue, but need some hints on where to start. I haven't been following this thread, and I'm pretty sleepy right now, so sorry if this is irrelevant, but I had a somewhat similar problem that was fixed by adding hint.atkbd.0.flags="0x1" to /boot/device.hints . 
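For readers unsure whether the hint suggested here is already active on their system, the following is a minimal sketch that reads the text of a device.hints file and reports whether bit 0 (FAIL_IF_NO_KBD, as described earlier in this thread) is set for atkbd. The function name `atkbd_fail_if_no_kbd` is hypothetical, and the parsing assumes the simple `name="value"` form with `#` comments; it is an illustration, not a full hints parser.

```python
def atkbd_fail_if_no_kbd(hints_text, unit=0):
    """Return True if hint.atkbd.<unit>.flags has bit 0 set in the given hints text.

    Hypothetical helper: assumes one hint per line and '#' starting a comment.
    """
    key = f"hint.atkbd.{unit}.flags"
    for raw in hints_text.splitlines():
        # Drop any trailing comment and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line.startswith(key + "="):
            continue
        value = line.split("=", 1)[1].strip().strip('"')
        # Base 0 lets int() accept both "0x1" and "1".
        return bool(int(value, 0) & 0x1)
    # No matching hint found: the flag is not set via this text.
    return False


if __name__ == "__main__":
    sample = 'hint.atkbd.0.flags="0x1"\n'
    print(atkbd_fail_if_no_kbd(sample))
```

On a real system the text would come from /boot/device.hints; the sample string is used here only to keep the sketch self-contained.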
-- Nate Eldredge neldredge@math.ucsd.edu From avg at icyb.net.ua Wed Nov 12 04:20:46 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 12 04:21:04 2008 Subject: ukbd attachment and root mount In-Reply-To: <20081112121410.GA24629@icarus.home.lan> References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> <20081112121410.GA24629@icarus.home.lan> Message-ID: <491ACA19.2040008@icyb.net.ua> on 12/11/2008 14:14 Jeremy Chadwick said the following: > On Wed, Nov 12, 2008 at 01:58:58PM +0200, Andriy Gapon wrote: [snip] >> 2. if ukbd driver is not attached then I don't see any way USB keyboard >> would work in non-legacy way > > Regarding #2: at which stage? boot0/boot2/loader require an AT or PS/2 > keyboard to work. None of these stages use ukbd(4) or anything -- there > is no kernel loaded at this point!! Meaning: if you have a USB keyboard, > your BIOS will need to have a "USB Legacy" option to cause it to act as > a PS/2 keyboard, for typing in boot0/boot2/loader to work. > > Device hints are for kernel drivers, once the kernel is loaded. Jeremy, I understand all of this. In subject line and earlier messages I say that I am interested in mountroot prompt - the prompt where kernel can ask about what device to use for root filesystem. Essentially I would like kernel to recognize USB keyboard (and disable all the legacy stuff if needed) before it prompts for the root device. 
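To check the attach ordering discussed in this thread against one's own logs, a minimal sketch along these lines may help. It looks for the first "Trying to mount root from" line and the first ukbd attach line and reports which came first. It assumes kernel log lines shaped like the excerpts quoted earlier in the thread; the function name `ukbd_before_mountroot` is hypothetical.

```python
import re

# Patterns modelled on the log excerpts quoted in this thread (an assumption).
MOUNT_RE = re.compile(r"Trying to mount root from ")
UKBD_RE = re.compile(r"\bukbd\d+: ")


def ukbd_before_mountroot(lines):
    """Return True if ukbd attached before the root mount attempt,
    False if it attached after, and None if either line is missing."""
    mount_idx = ukbd_idx = None
    for i, line in enumerate(lines):
        if mount_idx is None and MOUNT_RE.search(line):
            mount_idx = i
        if ukbd_idx is None and UKBD_RE.search(line):
            ukbd_idx = i
    if mount_idx is None or ukbd_idx is None:
        return None
    return ukbd_idx < mount_idx


if __name__ == "__main__":
    # Lines copied from the second log excerpt in this thread.
    sample = [
        "Nov 4 17:54:03 odyssey kernel: Trying to mount root from zfs:tank/root",
        "Nov 4 17:54:03 odyssey kernel: ukbd0: rev 1.10/1.10, addr 3> on uhub2",
        "Nov 4 17:54:03 odyssey kernel: kbd2 at ukbd0",
    ]
    print(ukbd_before_mountroot(sample))  # False: root mount was attempted first
```

The same function can be fed the lines of /var/log/messages or saved dmesg output; a result of None means one of the two events was not logged at all, which matches the interactive mountroot case described above where no ukbd lines appear.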
-- Andriy Gapon From koitsu at FreeBSD.org Wed Nov 12 04:33:17 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Wed Nov 12 04:33:35 2008 Subject: ukbd attachment and root mount In-Reply-To: <491ACA19.2040008@icyb.net.ua> References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> <20081112121410.GA24629@icarus.home.lan> <491ACA19.2040008@icyb.net.ua> Message-ID: <20081112123315.GA24907@icarus.home.lan> On Wed, Nov 12, 2008 at 02:20:41PM +0200, Andriy Gapon wrote: > on 12/11/2008 14:14 Jeremy Chadwick said the following: > > On Wed, Nov 12, 2008 at 01:58:58PM +0200, Andriy Gapon wrote: > [snip] > >> 2. if ukbd driver is not attached then I don't see any way USB keyboard > >> would work in non-legacy way > > > > Regarding #2: at which stage? boot0/boot2/loader require an AT or PS/2 > > keyboard to work. None of these stages use ukbd(4) or anything -- there > > is no kernel loaded at this point!! Meaning: if you have a USB keyboard, > > your BIOS will need to have a "USB Legacy" option to cause it to act as > > a PS/2 keyboard, for typing in boot0/boot2/loader to work. > > > > Device hints are for kernel drivers, once the kernel is loaded. > > Jeremy, > > I understand all of this. > In subject line and earlier messages I say that I am interested in > mountroot prompt - the prompt where kernel can ask about what device to > use for root filesystem. > Essentially I would like kernel to recognize USB keyboard (and disable > all the legacy stuff if needed) before it prompts for the root device. I fully understand that fact. However, I don't see the logic in that statement. You should be able to remove and add a keyboard at any time and be able to type immediately. Meaning: I don't see why when the keyboard recognition is performed (e.g. before printing mountroot or after) matters. It should not. I think this is a red herring. 
I've seen the problem where I have a fully functional USB keyboard in boot0/boot2/loader and in multi-user, but when booting into single-user or when getting a mountroot prompt, the keyboard does not function. When the mountroot prompt is printed (before or after ukbd attached) makes no difference for me in this scenario -- I tested it many times. It's very possible that "something" (kbdcontrol?) is getting run only during late stages of multi-user, which makes the keyboard work. But prior to that "something" being run (but AFTER boot2/loader), the keyboard is not truly usable. I hope everyone here is also aware of that fact that not all keyboards are created equal. Case in point (and this reason is exactly why I am purchasing a native PS/2 keyboard, as USB4BSD doesn't work with all USB keyboards right now): http://lists.freebsd.org/pipermail/freebsd-current/2008-November/000219.html The bottom line: FreeBSD cannot be reliably used with a USB keyboard in all circumstances. And that is a very sad reality, because 90% of the keyboards you find on the consumer and enterprise market are USB -- native PS/2 keyboards are now a scarcity. Do not even for a minute tell me "buy a USB-to-PS2 adapter", because the "green ones" that come with USB mice do not work with USB keyboards. I have even bought a "purple" USB-to-PS2 keyboard adapter from Amazon, specifically for this purpose, and it *does not work*. I found out weeks later the adapters only work on CERTAIN models of USB keyboards, depending upon how they're engineered. What really needs to happen here should be obvious: we need some form of inexpensive keyboard-only USB support in boot2/loader. I would *love* to know how Linux and Windows solve this problem. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From avg at icyb.net.ua Wed Nov 12 04:49:20 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 12 04:50:37 2008 Subject: ukbd attachment and root mount In-Reply-To: <20081112123315.GA24907@icarus.home.lan> References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> <20081112121410.GA24629@icarus.home.lan> <491ACA19.2040008@icyb.net.ua> <20081112123315.GA24907@icarus.home.lan> Message-ID: <491AD0CB.8050309@icyb.net.ua> on 12/11/2008 14:33 Jeremy Chadwick said the following: > On Wed, Nov 12, 2008 at 02:20:41PM +0200, Andriy Gapon wrote: >> on 12/11/2008 14:14 Jeremy Chadwick said the following: >>> On Wed, Nov 12, 2008 at 01:58:58PM +0200, Andriy Gapon wrote: >> [snip] >>>> 2. if ukbd driver is not attached then I don't see any way USB keyboard >>>> would work in non-legacy way >>> Regarding #2: at which stage? boot0/boot2/loader require an AT or PS/2 >>> keyboard to work. None of these stages use ukbd(4) or anything -- there >>> is no kernel loaded at this point!! Meaning: if you have a USB keyboard, >>> your BIOS will need to have a "USB Legacy" option to cause it to act as >>> a PS/2 keyboard, for typing in boot0/boot2/loader to work. >>> >>> Device hints are for kernel drivers, once the kernel is loaded. >> Jeremy, >> >> I understand all of this. >> In subject line and earlier messages I say that I am interested in >> mountroot prompt - the prompt where kernel can ask about what device to >> use for root filesystem. >> Essentially I would like kernel to recognize USB keyboard (and disable >> all the legacy stuff if needed) before it prompts for the root device. > > I fully understand that fact. However, I don't see the logic in that > statement. You should be able to remove and add a keyboard at any time > and be able to type immediately. Meaning: I don't see why when the > keyboard recognition is performed (e.g. before printing mountroot or > after) matters. It should not. 
I think this is a red herring. I think that this does matter because keyboard recognition is performed after the 'mounting from' log line *only if* root mount is done automatically. If there is an actual interactive prompt then recognition is not performed, at least I do not see any relevant lines on the screen and I am stuck at the prompt. > I've seen the problem where I have a fully functional USB keyboard in > boot0/boot2/loader For me it even randomly dies at these stages. I reported this in a different thread. But this should not be related to kernel behavior. >and in multi-user, For me this always works. > but when booting into single-user For me this always works. > or when getting a mountroot prompt, the keyboard does not function. > When the mountroot prompt is printed (before or after ukbd attached) > makes no difference for me in this scenario -- I tested it many times. For me ukbd lines are never printed if I get actual interactive mountroot prompt. > It's very possible that "something" (kbdcontrol?) is getting run only > during late stages of multi-user, which makes the keyboard work. But > prior to that "something" being run (but AFTER boot2/loader), the > keyboard is not truly usable. For me this is not true. My keyboard always works after ukbd lines appear on screen. > I hope everyone here is also aware of that fact that not all keyboards > are created equal. Case in point (and this reason is exactly why I > am purchasing a native PS/2 keyboard, as USB4BSD doesn't work with > all USB keyboards right now): For me this is not an option, no PS/2 ports. > http://lists.freebsd.org/pipermail/freebsd-current/2008-November/000219.html > > The bottom line: > > FreeBSD cannot be reliably used with a USB keyboard in all > circumstances.And that is a very sad reality, because 90% of the > keyboards you find on the consumer and enterprise market are USB -- > native PS/2 keyboards are now a scarcity. 
I agree that this is a sad reality but only for boot stages where we depend on external entity named BIOS to help us. This doesn't have to be a sad reality once kernel takes control. USB support in boot chain - I don't know - this would be great of course but that's a lot of code. -- Andriy Gapon From koitsu at FreeBSD.org Wed Nov 12 05:21:26 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Wed Nov 12 05:21:45 2008 Subject: ukbd attachment and root mount In-Reply-To: <491AD0CB.8050309@icyb.net.ua> References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> <20081112121410.GA24629@icarus.home.lan> <491ACA19.2040008@icyb.net.ua> <20081112123315.GA24907@icarus.home.lan> <491AD0CB.8050309@icyb.net.ua> Message-ID: <20081112132124.GA25637@icarus.home.lan> On Wed, Nov 12, 2008 at 02:49:15PM +0200, Andriy Gapon wrote: > on 12/11/2008 14:33 Jeremy Chadwick said the following: > > On Wed, Nov 12, 2008 at 02:20:41PM +0200, Andriy Gapon wrote: > >> on 12/11/2008 14:14 Jeremy Chadwick said the following: > >>> On Wed, Nov 12, 2008 at 01:58:58PM +0200, Andriy Gapon wrote: > >> [snip] > >>>> 2. if ukbd driver is not attached then I don't see any way USB keyboard > >>>> would work in non-legacy way > >>> Regarding #2: at which stage? boot0/boot2/loader require an AT or PS/2 > >>> keyboard to work. None of these stages use ukbd(4) or anything -- there > >>> is no kernel loaded at this point!! Meaning: if you have a USB keyboard, > >>> your BIOS will need to have a "USB Legacy" option to cause it to act as > >>> a PS/2 keyboard, for typing in boot0/boot2/loader to work. > >>> > >>> Device hints are for kernel drivers, once the kernel is loaded. > >> Jeremy, > >> > >> I understand all of this. > >> In subject line and earlier messages I say that I am interested in > >> mountroot prompt - the prompt where kernel can ask about what device to > >> use for root filesystem. 
> >> Essentially I would like kernel to recognize USB keyboard (and disable > >> all the legacy stuff if needed) before it prompts for the root device. > > > > I fully understand that fact. However, I don't see the logic in that > > statement. You should be able to remove and add a keyboard at any time > > and be able to type immediately. Meaning: I don't see why when the > > keyboard recognition is performed (e.g. before printing mountroot or > > after) matters. It should not. I think this is a red herring. > > I think that this does matter because keyboard recognition is performed > after the 'mounting from' log line *only if* root mount is done > automatically. > If there is an actual interactive prompt then recognition is not > performed, at least I do not see any relevant lines on the screen and I > am stuck at the prompt. > > > I've seen the problem where I have a fully functional USB keyboard in > > boot0/boot2/loader > > For me it even randomly dies at these stages. > I reported this in a different thread. > But this should not be related to kernel behavior. > > >and in multi-user, > > For me this always works. > > > but when booting into single-user > > For me this always works. > > > or when getting a mountroot prompt, the keyboard does not function. > > When the mountroot prompt is printed (before or after ukbd attached) > > makes no difference for me in this scenario -- I tested it many times. > > For me ukbd lines are never printed if I get actual interactive > mountroot prompt. > > > It's very possible that "something" (kbdcontrol?) is getting run only > > during late stages of multi-user, which makes the keyboard work. But > > prior to that "something" being run (but AFTER boot2/loader), the > > keyboard is not truly usable. > > For me this is not true. My keyboard always works after ukbd lines > appear on screen. I've pointed you to evidence where this isn't true, especially when using the USB4BSD stack. 
There is something called "boot legacy protocol" which USB keyboards have to support to properly be interfaced with in FreeBSD using the USB4BSD stack; in the case of the Microsoft Natural Ergo 4000 keyboard, it does not play well with USB4BSD (it DOES work with the old USB stack, but none of the multimedia keys work, and worse, the F-Lock key does not work; this is because those keys use uhid(4) and not ukbd(4)). Linux has a __20 page Wiki document__ on **just this keyboard**. That should give you some idea of how complex the situation with USB keyboards is in general. http://www.gentoo-wiki.info/HOWTO_Microsoft_Natural_Ergonomic_Keyboard_4000 > > I hope everyone here is also aware of that fact that not all keyboards > > are created equal. Case in point (and this reason is exactly why I > > am purchasing a native PS/2 keyboard, as USB4BSD doesn't work with > > all USB keyboards right now): > > For me this is not an option, no PS/2 ports. I don't know what to say to ***ANY*** of the above, other than this: No one is doing anything about this problem because there does not appear to be a 100% reproducible always-screws-up-when-I-do-this scenario that happens to *every FreeBSD user*. Until we settle down, stop replying to Emails with one-liner injections, and compile a list of test scenarios/cases that people can perform, and get these people to provide both 1) full hardware details, 2) full kernel configuration files, 3) full loader.conf files, and 4) full device.hints files, we're not going to get anywhere. > > http://lists.freebsd.org/pipermail/freebsd-current/2008-November/000219.html > > > > The bottom line: > > > > FreeBSD cannot be reliably used with a USB keyboard in all > > circumstances.And that is a very sad reality, because 90% of the > > keyboards you find on the consumer and enterprise market are USB -- > > native PS/2 keyboards are now a scarcity. 
> > I agree that this is a sad reality but only for boot stages where we > depend on external entity named BIOS to help us. > This doesn't have to be a sad reality once kernel takes control. It's been confirmed by numerous people now, including #bsdports users, that "USB Legacy" does not work for some individuals. This is either because of BIOS bugs, or because the USB keyboards do not support tying into SMM. We don't know the true cause. One thing we do know: we have FreeBSD users stating they cannot type in boot0/boot2/loader, even with USB Legacy enabled, so going into single-user after a reboot is impossible. Another thing we do know: we have FreeBSD users who do not have fully functional USB keyboards in FreeBSD (some see ukbd attach, others do not; some are using USB4BSD, others are not). So, can someone take the time to come up with test scenarios/cases so that users can perform these tests, list off the exact hardware they have, and we can see if there is a consistent/common failure between everyone? -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From avg at icyb.net.ua Wed Nov 12 05:33:52 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Nov 12 05:34:00 2008 Subject: ukbd attachment and root mount In-Reply-To: <20081112132124.GA25637@icarus.home.lan> References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> <20081112121410.GA24629@icarus.home.lan> <491ACA19.2040008@icyb.net.ua> <20081112123315.GA24907@icarus.home.lan> <491AD0CB.8050309@icyb.net.ua> <20081112132124.GA25637@icarus.home.lan> Message-ID: <491ADB3B.2090000@icyb.net.ua> on 12/11/2008 15:21 Jeremy Chadwick said the following: > I don't know what to say to ***ANY*** of the above, other than this: > > No one is doing anything about this problem because there does not > appear to be a 100% reproducible always-screws-up-when-I-do-this > scenario that happens to *every FreeBSD user*. > > Until we settle down, stop replying to Emails with one-liner injections, > and compile a list of test scenarios/cases that people can perform, and > get these people to provide both 1) full hardware details, 2) full > kernel configuration files, 3) full loader.conf files, and 4) full > device.hints files, we're not going to get anywhere. Well I started two separate threads. This thread is about one very specific issue - ukbd attaching after mountroot code. Again, in this thread I am only interested in getting ukbd to attach before the mount root. I am not interested in BIOS, boot chain, etc. I am not even interested in speculations about whether keyboard would work or not at mountroot prompt if it were attaching before it. -- Andriy Gapon From freebsd-stable at epcdirect.co.uk Wed Nov 12 08:18:13 2008 From: freebsd-stable at epcdirect.co.uk (Lawrence Farr) Date: Wed Nov 12 08:18:21 2008 Subject: Cannot see disks attached to Marvell controller Message-ID: <005601c944df$49e17d40$dda477c0$@co.uk> I've got an Asus P5E3WSPro with 8 SATA ports and 8 disks attached. 
6 disks are on one controller (and work perfectly) and 2 are on a second Marvell controller. FreeBSD sees the controller, but not the disks. If I move a working disk to the Marvell controller I can boot off it up to the stage of mounting root where it fails to see the disk.

atacontrol shows:

ATA channel 2:
    Master:      no device present
    Slave:       no device present
ATA channel 3:
    Master:      no device present
    Slave:       no device present
ATA channel 4:
    Master:  ad8 Serial ATA II
    Slave:       no device present
ATA channel 5:
    Master: ad10 Serial ATA II
    Slave:       no device present
ATA channel 6:
    Master: ad12 Serial ATA II
    Slave:       no device present
ATA channel 7:
    Master: ad14 Serial ATA II
    Slave:       no device present
ATA channel 8:
    Master: ad16 Serial ATA II
    Slave:       no device present
ATA channel 9:
    Master: ad18 Serial ATA II
    Slave:       no device present

Here's the pciconf listing:

atapci1@pci0:0:31:2: class=0x010601 card=0x82771043 chip=0x29228086 rev=0x02 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82801IB/IR/IH (ICH9 Family) 6 port SATA AHCI Controller'
    class = mass storage
--
atapci0@pci0:1:0:0: class=0x01048f card=0x82201043 chip=0x614511ab rev=0xa1 hdr=0x00
    vendor = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
    device = '? Add-on IC to provide 4x SATA Ports, attached to ICH7 (SthBridge?) via PCI-Express.'
    class = mass storage

Anyone got any ideas why they don't show up? I've tried every BIOS option with the controller, i.e. RAID mode etc.
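When comparing listings like the one above across several boots or BIOS settings, it can help to reduce them to "which channels are empty". The following is a minimal sketch under the assumption that the `atacontrol list` output has the "ATA channel N: / Master: / Slave:" shape quoted in this message; the function names `parse_atacontrol_list` and `empty_channels` are hypothetical.

```python
import re

# One match per channel: channel number, master description, slave description.
CHANNEL_RE = re.compile(
    r"ATA channel (\d+):\s*Master:\s*(.*?)\s*Slave:\s*(.*?)(?=\s*ATA channel|\s*$)",
    re.S,
)
EMPTY = "no device present"


def parse_atacontrol_list(text):
    """Map channel number -> (master, slave) description strings."""
    return {int(ch): (master, slave) for ch, master, slave in CHANNEL_RE.findall(text)}


def empty_channels(text):
    """Return the sorted channel numbers where neither master nor slave is present."""
    channels = parse_atacontrol_list(text)
    return sorted(ch for ch, (m, s) in channels.items() if m == EMPTY and s == EMPTY)


if __name__ == "__main__":
    sample = """ATA channel 2:
    Master:      no device present
    Slave:       no device present
ATA channel 4:
    Master:  ad8 Serial ATA II
    Slave:       no device present
"""
    print(empty_channels(sample))
```

In the listing quoted above this would flag channels 2 and 3, which is consistent with the two disks on the Marvell controller not being detected; whether those two channels actually belong to the Marvell controller is an assumption not confirmed by the output itself.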
From babkin at verizon.net Wed Nov 12 06:11:53 2008 From: babkin at verizon.net (Sergey Babkin) Date: Wed Nov 12 08:48:03 2008 Subject: ukbd attachment and root mount References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> <20081112121410.GA24629@icarus.home.lan> <491ACA19.2040008@icyb.net.ua> <20081112123315.GA24907@icarus.home.lan> Message-ID: <491AD7BB.2EAA9AA0@verizon.net> Jeremy Chadwick wrote: > > What really needs to happen here should be obvious: we need some form of > inexpensive keyboard-only USB support in boot2/loader. > > I would *love* to know how Linux and Windows solve this problem. If I remember right, UnixWare used(s) the BIOS calls in the loader. -SB From nakal at web.de Wed Nov 12 08:52:30 2008 From: nakal at web.de (Martin) Date: Wed Nov 12 08:52:38 2008 Subject: usb keyboard dying at loader prompt In-Reply-To: References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> <20081111213344.6657548c@zelda.local> Message-ID: <20081112175217.1b37caf1@zelda.local> On Wed, 12 Nov 2008 12:36:19 +0300 pluknet wrote: > I have the same problem with my ukbd&ums: > they are powered off during the boot and I have to re-attach them. > MB: Asus p5k. Hi, I've noticed one thing today. I can switch off USB-Keyboard support in my BIOS. In this case, I cannot use my keyboard during boot prompt, but FreeBSD at least initializes the USB-controller correctly, so I can use it later when entering my geli partition password. It seems the BIOS on some mainboards puts the USB controller in a state from which FreeBSD cannot initialize the hardware anymore. And, let's not forget, there is the second problem with devices that suddenly power off at apparently random times. -- Martin
From nakal at web.de Wed Nov 12 09:30:17 2008 From: nakal at web.de (Martin) Date: Wed Nov 12 09:34:12 2008 Subject: ukbd attachment and root mount In-Reply-To: <20081112132124.GA25637@icarus.home.lan> References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> <20081112121410.GA24629@icarus.home.lan> <491ACA19.2040008@icyb.net.ua> <20081112123315.GA24907@icarus.home.lan> <491AD0CB.8050309@icyb.net.ua> <20081112132124.GA25637@icarus.home.lan> Message-ID: <20081112183012.57af6eb5@zelda.local> On Wed, 12 Nov 2008 05:21:24 -0800 Jeremy Chadwick wrote: > Until we settle down, stop replying to Emails with one-liner > injections, and compile a list of test scenarios/cases that people > can perform, and get these people to provide both 1) full hardware > details, 2) full kernel configuration files, 3) full loader.conf > files, and 4) full device.hints files, we're not going to get > anywhere. Ok, I will add the details for the GA-EP45-DS3R based system. 1) dmesg Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-PRERELEASE #0: Mon Nov 10 08:23:21 CET 2008 root@kirby:/usr/obj/usr/src/sys/GENERIC Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz (3166.32-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x10676 Stepping = 6 Features=0xbfebfbff Features2=0x8e3fd> AMD Features=0x20100800 AMD Features2=0x1 Cores per package: 2 usable memory = 8574255104 (8177 MB) avail memory = 8286810112 (7902 MB) ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0: Changing APIC ID to 2 ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) acpi0: reservation of 0, a0000 (3) failed acpi0: reservation of 100000, cfdb0000 (3) failed Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 900 acpi_button0: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pcib1: irq 16 at device 1.0 on pci0 pci1: on pcib1 vgapci0: port 0xa000-0xa0ff mem 0xd0000000-0xdfffffff,0xe5000000-0xe500ffff irq 16 at device 0.0 on pci1 pcm0: mem 0xe5010000-0xe5013fff irq 17 at device 0.1 on pci1 pcm0: [ITHREAD] uhci0: port 0xe000-0xe01f irq 16 at device 26.0 on pci0 uhci0: [GIANT-LOCKED] uhci0: [ITHREAD] usb0: on uhci0 usb0: USB revision 1.0 uhub0: on usb0 uhub0: 2 ports with 2 removable, self powered uhci1: port 0xe100-0xe11f irq 21 at device 26.1 on pci0 uhci1: [GIANT-LOCKED] uhci1: [ITHREAD] usb1: on uhci1 usb1: USB revision 1.0 uhub1: on usb1 uhub1: 2 ports with 2 removable, self powered uhci2: port 0xe200-0xe21f irq 18 at device 26.2 on pci0 uhci2: [GIANT-LOCKED] uhci2: [ITHREAD] usb2: on uhci2 usb2: USB revision 1.0 uhub2: on usb2 uhub2: 2 ports with 2 removable, self powered ehci0: 
mem 0xe9305000-0xe93053ff irq 18 at device 26.7 on pci0 ehci0: [GIANT-LOCKED] ehci0: [ITHREAD] usb3: EHCI version 1.0 usb3: companion controllers, 2 ports each: usb0 usb1 usb2 usb3: on ehci0 usb3: USB revision 2.0 uhub3: on usb3 uhub3: 6 ports with 6 removable, self powered pcm1: mem 0xe9300000-0xe9303fff irq 22 at device 27.0 on pci0 pcm1: [ITHREAD] pcib2: irq 16 at device 28.0 on pci0 pci2: on pcib2 pcib3: irq 19 at device 28.3 on pci0 pci3: on pcib3 atapci0: port 0xb000-0xb007,0xb100-0xb103,0xb200-0xb207,0xb300-0xb303,0xb400-0xb40f irq 19 at device 0.0 on pci3 atapci0: [ITHREAD] ata2: on atapci0 ata2: [ITHREAD] pcib4: irq 16 at device 28.4 on pci0 pci4: on pcib4 re0: port 0xc000-0xc0ff mem Ethernet> 0xe9010000-0xe9010fff,0xe9000000-0xe900ffff irq 16 at device Ethernet> 0.0 on pci4 re0: Chip rev. 0x3c000000 re0: MAC rev. 0x00400000 miibus0: on re0 rgephy0: PHY 1 on miibus0 rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto re0: Ethernet address: 00:1f:d0:24:96:ab re0: [FILTER] pcib5: irq 17 at device 28.5 on pci0 pci5: on pcib5 re1: port 0xd000-0xd0ff mem Ethernet> 0xe9110000-0xe9110fff,0xe9100000-0xe910ffff irq 17 at device Ethernet> 0.0 on pci5 re1: Chip rev. 0x3c000000 re1: MAC rev. 
0x00400000 miibus1: on re1 rgephy1: PHY 1 on miibus1 rgephy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto re1: Ethernet address: 00:1f:d0:24:96:a9 re1: [FILTER] uhci3: port 0xe300-0xe31f irq 23 at device 29.0 on pci0 uhci3: [GIANT-LOCKED] uhci3: [ITHREAD] usb4: on uhci3 usb4: USB revision 1.0 uhub4: on usb4 uhub4: 2 ports with 2 removable, self powered uhci4: port 0xe400-0xe41f irq 19 at device 29.1 on pci0 uhci4: [GIANT-LOCKED] uhci4: [ITHREAD] usb5: on uhci4 usb5: USB revision 1.0 uhub5: on usb5 uhub5: 2 ports with 2 removable, self powered uhci5: port 0xe500-0xe51f irq 18 at device 29.2 on pci0 uhci5: [GIANT-LOCKED] uhci5: [ITHREAD] usb6: on uhci5 usb6: USB revision 1.0 uhub6: on usb6 uhub6: 2 ports with 2 removable, self powered ehci1: mem 0xe9304000-0xe93043ff irq 23 at device 29.7 on pci0 ehci1: [GIANT-LOCKED] ehci1: [ITHREAD] usb7: EHCI version 1.0 usb7: companion controllers, 2 ports each: usb4 usb5 usb6 usb7: on ehci1 usb7: USB revision 2.0 uhub7: on usb7 uhub7: 6 ports with 6 removable, self powered pcib6: at device 30.0 on pci0 pci6: on pcib6 fwohci0: mem 0xe9204000-0xe92047ff,0xe9200000-0xe9203fff irq 23 at device 7.0 on pci6 fwohci0: [FILTER] fwohci0: OHCI version 1.10 (ROM=0) fwohci0: No. of Isochronous channels is 4. fwohci0: EUI64 00:2c:a1:59:00:00:1f:d0 fwohci0: Phy 1394a available S400, 3 ports. fwohci0: Link S400, max_rec 2048 bytes. 
firewire0: on fwohci0 fwe0: on firewire0 if_fwe0: Fake Ethernet address: 02:2c:a1:00:1f:d0 fwe0: Ethernet address: 02:2c:a1:00:1f:d0 fwip0: on firewire0 fwip0: Firewire address: 00:2c:a1:59:00:00:1f:d0 @ 0xfffe00000000, S400, maxrec 2048 sbp0: on firewire0 dcons_crom0: on firewire0 dcons_crom0: bus_addr 0xcf2b4000 fwohci0: Initiate bus reset fwohci0: BUS reset fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode isab0: at device 31.0 on pci0 isa0: on isab0 atapci1: port 0xe600-0xe607,0xe700-0xe703,0xe800-0xe807,0xe900-0xe903,0xea00-0xea1f mem 0xe9306000-0xe93067ff irq 19 at device 31.2 on pci0 atapci1: [ITHREAD] atapci1: AHCI Version 01.20 controller with 6 ports detected ata3: on atapci1 ata3: [ITHREAD] ata4: on atapci1 ata4: [ITHREAD] ata5: on atapci1 ata5: [ITHREAD] ata6: on atapci1 ata6: [ITHREAD] ata7: on atapci1 ata7: [ITHREAD] ata8: on atapci1 ata8: [ITHREAD] pci0: at device 31.3 (no driver attached) fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FILTER] fd0: <1440-KB 3.5" drive> on fdc0 drive 0 sio0: configured irq 4 not in bitmap of probed irqs 0 sio0: port may not be enabled sio0: configured irq 4 not in bitmap of probed irqs 0 sio0: port may not be enabled sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 sio0: type 16550A sio0: [FILTER] ppc0: port 0x378-0x37f irq 7 on acpi0 ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode ppbus0: on ppc0 ppbus0: [ITHREAD] plip0: on ppbus0 plip0: WARNING: using obsoleted IFF_NEEDSGIANT flag lpt0: on ppbus0 lpt0: Interrupt-driven port ppi0: on ppbus0 ppc0: [GIANT-LOCKED] ppc0: [ITHREAD] cpu0: on acpi0 est0: on cpu0 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 61a492006004920 device_attach: est0 attach returned 6 p4tcc0: on cpu0 cpu1: on acpi0 est1: on cpu1 est: CPU supports Enhanced Speedstep, but is not recognized. 
est: cpu_vendor GenuineIntel, msr 61a492006004920 device_attach: est1 attach returned 6 p4tcc1: on cpu1 orm0: at iomem 0xd0000-0xd1fff on isa0 atkbdc0: at port 0x60,0x64 on isa0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> sio1: configured irq 3 not in bitmap of probed irqs 0 sio1: port may not be enabled vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ukbd0: on uhub2 kbd2 at ukbd0 uhid0: on uhub2 ums0: on uhub2 ums0: 8 buttons and Z dir. Timecounters tick every 1.000 msec firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me) firewire0: bus manager 0 (me) ad6: 476938MB at ata3-master SATA300 GEOM_JOURNAL: Journal 3939325718: ad6s1f contains data. GEOM_JOURNAL: Journal 3939325718: ad6s1f contains journal. GEOM_JOURNAL: Journal ad6s1f clean. acd0: DVDR at ata4-master SATA150 pcm0: pcm0: pcm1: pcm1: SMP: AP CPU #1 Launched! Trying to mount root from ufs:/dev/ad6s1a cryptosoft0: on motherboard GEOM_ELI: Device ad6s1g.eli created. GEOM_ELI: Encryption: Blowfish-CBC 448 GEOM_ELI: Crypto: software GEOM_JOURNAL: Journal 2001271740: ad6s1g.eli contains data. GEOM_JOURNAL: Journal 2001271740: ad6s1g.eli contains journal. GEOM_JOURNAL: Journal ad6s1g.eli clean. GEOM_ELI: Device ad6s1b.eli created. GEOM_ELI: Encryption: AES-CBC 256 GEOM_ELI: Crypto: software 2) As you can see above, GENERIC (here stable, but also occurs on BETA2). 3) loader.conf: acpi_load="YES" acpi_video_load="YES" beastie_disable="YES" geom_journal_load="YES" #smb_load="YES" #smbus_load="YES" #ichsmb_load="YES" snd_hda_load="YES" #aio_load="YES" #kqemu_load="YES" kern.cam.scsi_delay=1000 autoboot_delay=3 linux_load="YES" linprocfs_load="YES" linsysfs_load="YES" 4) device.hints unchanged. > It's been confirmed by numerous people now, including #bsdports users, > that "USB Legacy" does not work for some individuals. 
This is either > because of BIOS bugs, or because the USB keyboards do not support > tying into SMM. We don't know the true cause. I'm not sure, if every BIOS has got such a setting. I'm not fully sure, if this is a BIOS bug. It could be, of course. Gigabyte has released BIOS firmware updates that are not usable, until one installs Windows (the changes history does not mention any USB fixes though). It will take some time until I can patch the firmware. > One thing we do know: we have FreeBSD users stating they cannot type > in boot0/boot2/loader, even with USB Legacy enabled, so going into > single-user after a reboot is impossible. > > Another thing we do know: we have FreeBSD users who do not have fully > functional USB keyboards in FreeBSD (some see ukbd attach, others do > not; some are using USB4BSD, others are not). Yes. These are 3 different problems. 1) No keyboard in bootloader => missing BIOS USB support. 2) No keyboard after USB controller initialisation => missing quirks? 3) No keyboard spontaneously while working => bug? > So, can someone take the time to come up with test scenarios/cases so > that users can perform these tests, list off the exact hardware they > have, and we can see if there is a consistent/common failure between > everyone? If you need anything more, I can try to deliver the information. I sometimes run out of ideas how to avoid annoying the developers. :) (In other words, I have more problems to report waiting in the queue...) ;) -- Martin From jbondc at openmv.com Wed Nov 12 09:35:54 2008 From: jbondc at openmv.com (jbondc) Date: Wed Nov 12 09:38:42 2008 Subject: Replication system In-Reply-To: <62b856460811031507n2bbfdc14j3bfa7b6006208ff0@mail.gmail.com> References: <490F6EBF.5000102@minibofh.org> <62b856460811031507n2bbfdc14j3bfa7b6006208ff0@mail.gmail.com> Message-ID: <20464994.post@talk.nabble.com> Michael Grant-3 wrote: > > > GlusterFS http://www.gluster.org seems promising. 
It is a replication > layer that sits on top of FUSE (Filesystem in Userspace > http://fuse.sf.net). You can replicate pretty much any type of file > system, ufs, zfs, dos...etc. In other words, you don't need to > reformat your disk or create some special underlying file system. > GlusterFS is in userspace. > > However.... I have yet to get it working on freebsd. Anyone had any > luck with GlusterFS on Freebsd 6.x? > > The latest client currently segfaults on freebsd 6 I just created this wiki entry: http://www.freebsdwiki.net/index.php/GlusterFS On Freebsd 7 works quite well -- View this message in context: http://www.nabble.com/Replication-system-tp20311697p20464994.html Sent from the freebsd-stable mailing list archive at Nabble.com. From tim-lists at bishnet.net Wed Nov 12 09:58:37 2008 From: tim-lists at bishnet.net (Tim Bishop) Date: Wed Nov 12 09:58:44 2008 Subject: System deadlock when using mksnap_ffs Message-ID: <20081112175826.GD26195@carrick.bishnet.net> I've been playing around with snapshots lately but I've got a problem on one of my servers running 7-STABLE amd64: FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 I run the mksnap_ffs command to take the snapshot and some time later the system completely freezes up: paladin# cd /u2/.snap/ paladin# mksnap_ffs /u2 test.1 It only happens on this one filesystem, though, which might be to do with its size. It's not over the 2TB marker, but it's pretty close. It's also backed by a hardware RAID system, although a smaller filesystem on the same RAID has no issues. Filesystem 1K-blocks Used Avail Capacity Mounted on /dev/da0s1a 2078881084 921821396 990749202 48% /u2 To clarify "completely freezes up": unresponsive to all services over the network, except ping. On the console I can switch between the ttys, but none of them respond. The only way out is to hit the reset button. Any advice? 
I'm happy to help debug this further to get to the bottom of it. Thanks, Tim. -- Tim Bishop http://www.bishnet.net/tim/ PGP Key: 0x5AE7D984 From des at des.no Wed Nov 12 10:04:37 2008 From: des at des.no (Dag-Erling Smørgrav) Date: Wed Nov 12 10:04:49 2008 Subject: ukbd attachment and root mount In-Reply-To: <491AD7BB.2EAA9AA0@verizon.net> (Sergey Babkin's message of "Wed, 12 Nov 2008 08:18:51 -0500") References: <4911BA93.9030006@icyb.net.ua> <491ABFCD.3060309@icyb.net.ua> <491AC502.9000507@icyb.net.ua> <20081112121410.GA24629@icarus.home.lan> <491ACA19.2040008@icyb.net.ua> <20081112123315.GA24907@icarus.home.lan> <491AD7BB.2EAA9AA0@verizon.net> Message-ID: <86iqqsx2c2.fsf@ds4.des.no> Sergey Babkin writes: > Jeremy Chadwick writes: > > What really needs to happen here should be obvious: we need some > > form of inexpensive keyboard-only USB support in boot2/loader. > If I remember right, UnixWare used(s) the BIOS calls in the loader. So does FreeBSD. DES -- Dag-Erling Smørgrav - des@des.no From david at esn.org.za Wed Nov 12 10:11:02 2008 From: david at esn.org.za (David Peall) Date: Wed Nov 12 10:11:10 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081112175826.GD26195@carrick.bishnet.net> References: <20081112175826.GD26195@carrick.bishnet.net> Message-ID: > -----Original Message----- > From: owner-freebsd-stable@freebsd.org [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Tim Bishop > Sent: 12 November 2008 07:58 PM > To: freebsd-stable@freebsd.org > Cc: tim@bishnet.net > Subject: System deadlock when using mksnap_ffs > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 > 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > I run the mksnap_ffs command to take the snapshot and some time later > the system completely freezes up: If the file system is UFS2 it's a known problem but should have been fixed.
http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues ident /boot/kernel/kernel | grep subr_sleepqueue version should be greater than 1.39.2.3? Regards -- David Peall :: IT Manager e-Schools' Network :: http://www.esn.org.za/ Phone +27 (021) 674-9140 From tim-lists at bishnet.net Wed Nov 12 10:22:42 2008 From: tim-lists at bishnet.net (Tim Bishop) Date: Wed Nov 12 10:22:49 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: References: <20081112175826.GD26195@carrick.bishnet.net> Message-ID: <20081112182234.GE26195@carrick.bishnet.net> On Wed, Nov 12, 2008 at 08:10:50PM +0200, David Peall wrote: > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 > > 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > I run the mksnap_ffs command to take the snapshot and some time later > > the system completely freezes up: > > If the file system is UFS2 it's a known problem but should have been > fixed. > http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues > > ident /boot/kernel/kernel | grep subr_sleepqueue > > version should be greater than 1.39.2.3? Yes it's UFS2, and yes it's greater than 1.39.2.3: $FreeBSD: src/sys/kern/subr_sleepqueue.c,v 1.39.2.5 2008/09/16 20:01:57 jhb Exp $ Are you sure the problem referenced on that page is the same? It talks about "dog slow" snapshotting, which I see on other filesystems and machines. But in this particular case the system is dead, and does not recover. Tim. -- Tim Bishop http://www.bishnet.net/tim/ PGP Key: 0x5AE7D984 From mi+mill at aldan.algebra.com Wed Nov 12 10:37:30 2008 From: mi+mill at aldan.algebra.com (Mikhail Teterin) Date: Wed Nov 12 10:37:38 2008 Subject: dlopen-ing a library with OpenMP by a non-OpenMP process Message-ID: <491B1BD2.4050903@aldan.algebra.com> Hello! 
Currently, when a program built without OpenMP (-fopenmp) is trying to dlopen a library, built with the feature, the result is a crash from "bad system call": #0 0x00000008009a223c in ksem_init () from /lib/libc.so.7 #1 0x0000000800998a8f in sem_init () from /lib/libc.so.7 #2 0x00000008011a6537 in omp_get_nested () from /usr/lib/libgomp.so.1 #3 0x00000008011a3466 in ?? () from /usr/lib/libgomp.so.1 #4 0x0000000000000002 in ?? () #5 0x00000008005072b2 in dlsym () from /libexec/ld-elf.so.1 #6 0x0000000800507cd2 in dlopen () from /libexec/ld-elf.so.1 ... Can anything be done about this -- disable the OpenMP functionality, but keep the library usable (single-threaded)? The problem arises, in particular, when one is trying to use libraries built by either GraphicsMagick or ImageMagick ports. Both have an OpenMP option, which speeds up some internal algorithms. The option is off by default, but enabling it makes sense on an SMP system... Yet, this makes the library unsuitable for other purposes... Thanks! Yours, -mi P.S. I'm seeing the crash on a recent FreeBSD-7.1/amd64, but it, likely, can be seen elsewhere. From koitsu at FreeBSD.org Wed Nov 12 10:52:54 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Wed Nov 12 10:53:01 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081112182234.GE26195@carrick.bishnet.net> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112182234.GE26195@carrick.bishnet.net> Message-ID: <20081112185251.GA1838@icarus.home.lan> On Wed, Nov 12, 2008 at 06:22:35PM +0000, Tim Bishop wrote: > On Wed, Nov 12, 2008 at 08:10:50PM +0200, David Peall wrote: > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 > > > 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > the system completely freezes up: > > > > If the file system is UFS2 it's a known problem but should have been > > fixed. 
> > http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues > > > > ident /boot/kernel/kernel | grep subr_sleepqueue > > > > version should be greater than 1.39.2.3? > > Yes it's UFS2, and yes it's greater than 1.39.2.3: > > $FreeBSD: src/sys/kern/subr_sleepqueue.c,v 1.39.2.5 2008/09/16 20:01:57 jhb Exp $ > > Are you sure the problem referenced on that page is the same? It talks > about "dog slow" snapshotting, which I see on other filesystems and > machines. But in this particular case the system is dead, and does not > recover. This problem gets brought up every few weeks on average, I think. The problem still exists regardless of subr_sleepqueue.c 1.39.2.3. I can still reproduce it on every FreeBSD box I have access to. The last time I tried it was on 2008/10/24, on a RELENG_7 system built from source csup'd on the same day. The result of "dump -L -0 -a -f /someplace/fs.dump /usr" was the same: the system became more or less unusable (meaning to the point where you might as well not try to do anything with it because it's so incredibly slow, especially with anything I/O-bound), and mksnap_ffs remained in the following state for many, many minutes: load: 0.00 cmd: mksnap_ffs 10480 [wdrain] 0.00u 0.06s 0% 1076k Hitting ^C at this point took 4-5 *full minutes* to execute. While ^C was (hopefully) executing, the process remained in wdrain state as well. After the process was terminated fully, the system was again responsive. That filesystem (/usr) I was dumping: Filesystem 1K-blocks Used Avail Capacity iused ifree %iused Mounted on /dev/ad4s1f 163815904 3834316 146876316 3% 254752 20942046 1% /usr -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From lehmann at ans-netz.de Wed Nov 12 11:43:50 2008 From: lehmann at ans-netz.de (Oliver Lehmann) Date: Wed Nov 12 11:43:58 2008 Subject: 3Ware 9000 series hangs under load In-Reply-To: <13394481-8FDC-4934-BB12-FA5BCB2D35CD@nevada.net.nz> References: <20081029170728.be7cc7ab.lehmann@ans-netz.de> <13394481-8FDC-4934-BB12-FA5BCB2D35CD@nevada.net.nz> Message-ID: <20081112204351.ccc51c2f.lehmann@ans-netz.de> Philip Murray wrote: > I used to get this (FreeBSD 6.1 days) all the time, the controller > would just lock up almost on a daily basis (and have to wait for the > fsck 4 out of 24 hours in the day). > > Anyway, I stopped running 3dmd (or 3dm2 I think it's called now) to > monitor it, and the crashes went away. It's had hundreds of days > uptime since. > > I've never been game enough to try newer versions of 3dm, but a > cronjob of tw_cli allows me to monitor it now without the lockups. > > Might not be your problem, but it's worth a shot if all else fails. OK, it really looks like 3dm2 is causing the same problems here too. I've tried several 3dm2 versions, and beginning with the version released with 9.1.5.2 the system crashes under high I/O load. The previous release, included in 9.0.1 and 9.0.2 (made in 2004, IIRC), does not crash. Every release made later causes system crashes as well. I'll see what the support staff says about that. What cronjobs are you running in particular to "replace" 3dm2?
-- Oliver Lehmann http://www.pofo.de/ http://wishlist.ans-netz.de/ From mi+mill at aldan.algebra.com Wed Nov 12 11:45:55 2008 From: mi+mill at aldan.algebra.com (Mikhail Teterin) Date: Wed Nov 12 11:46:02 2008 Subject: dlopen-ing a library with OpenMP by a non-OpenMP process In-Reply-To: <20081112194350.GJ47073@deviant.kiev.zoral.com.ua> References: <491B1BD2.4050903@aldan.algebra.com> <20081112194350.GJ47073@deviant.kiev.zoral.com.ua> Message-ID: <491B3270.5080402@aldan.algebra.com> Sent by Kostik Belousov: > On Wed, Nov 12, 2008 at 01:09:22PM -0500, Mikhail Teterin wrote: > >> Hello! >> >> Currently, when a program built without OpenMP (-fopenmp) is trying to >> dlopen a library, built with the feature, the result is a crash from >> "bad system call": >> >> #0 0x00000008009a223c in ksem_init () from /lib/libc.so.7 >> #1 0x0000000800998a8f in sem_init () from /lib/libc.so.7 >> #2 0x00000008011a6537 in omp_get_nested () from /usr/lib/libgomp.so.1 >> #3 0x00000008011a3466 in ?? () from /usr/lib/libgomp.so.1 >> #4 0x0000000000000002 in ?? () >> #5 0x00000008005072b2 in dlsym () from /libexec/ld-elf.so.1 >> #6 0x0000000800507cd2 in dlopen () from /libexec/ld-elf.so.1 >> ... >> > Try "kldload sem". > Uhm... That worked... I see... Shouldn't sem_init be nicer about it, though? 
Thanks, -mi From kostikbel at gmail.com Wed Nov 12 11:47:44 2008 From: kostikbel at gmail.com (Kostik Belousov) Date: Wed Nov 12 11:47:51 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081112175826.GD26195@carrick.bishnet.net> References: <20081112175826.GD26195@carrick.bishnet.net> Message-ID: <20081112194735.GK47073@deviant.kiev.zoral.com.ua> On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > I've been playing around with snapshots lately but I've got a problem on > one of my servers running 7-STABLE amd64: > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > I run the mksnap_ffs command to take the snapshot and some time later > the system completely freezes up: > > paladin# cd /u2/.snap/ > paladin# mksnap_ffs /u2 test.1 > > It only happens on this one filesystem, though, which might be to do > with its size. It's not over the 2TB marker, but it's pretty close. It's > also backed by a hardware RAID system, although a smaller filesystem on > the same RAID has no issues. > > Filesystem 1K-blocks Used Avail Capacity Mounted on > /dev/da0s1a 2078881084 921821396 990749202 48% /u2 > > To clarify "completely freezes up": unresponsive to all services over > the network, except ping. On the console I can switch between the ttys, > but none of them respond. The only way out is to hit the reset button. > > Any advice? I'm happy to help debug this further to get to the bottom of > it. You need to provide information described in the http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html and especially http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html -------------- next part -------------- A non-text attachment was scrubbed... 
From tim-lists at bishnet.net Wed Nov 12 11:49:42 2008 From: tim-lists at bishnet.net (Tim Bishop) Date: Wed Nov 12 11:49:50 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081112175826.GD26195@carrick.bishnet.net> References: <20081112175826.GD26195@carrick.bishnet.net> Message-ID: <20081112194928.GA19539@carrick.bishnet.net>

On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> I run the mksnap_ffs command to take the snapshot and some time later
> the system completely freezes up:
>
> paladin# cd /u2/.snap/
> paladin# mksnap_ffs /u2 test.1

Someone (not named because they choose not to reply to the list) gave me the following patch:

--- sys/ufs/ffs/ffs_snapshot.c.orig	Wed Mar 22 09:42:31 2006
+++ sys/ufs/ffs/ffs_snapshot.c	Mon Nov 20 14:59:13 2006
@@ -282,6 +282,8 @@ restart:
 		if (error)
 			goto out;
 		bawrite(nbp);
+		if (cg % 10 == 0)
+			ffs_syncvnode(vp, MNT_WAIT);
 	}
 	/*
 	 * Copy all the cylinder group maps. Although the
@@ -303,6 +305,8 @@ restart:
 			goto out;
 		error = cgaccount(cg, vp, nbp, 1);
 		bawrite(nbp);
+		if (cg % 10 == 0)
+			ffs_syncvnode(vp, MNT_WAIT);
 		if (error)
 			goto out;
 	}

With the description: "What can happen is on a big file system it will fill up the buffer cache with I/O and then run out. When the buffer cache fills up then no more disk I/O can happen :-( When you do a sync, it flushes that out to disk so things don't hang."

It seems to work too. But it seems more like a workaround than a fix?

Tim.
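
A rough way to see why the periodic flush in the patch above could help is the toy model below. It is only a sketch under simplifying assumptions: peak_pending_writes is a hypothetical helper (not kernel code), each bawrite() is modelled as one pending buffer that never completes on its own, and each ffs_syncvnode(vp, MNT_WAIT) is modelled as draining everything queued so far. Real buffers also drain asynchronously, so the numbers are illustrative only.

```c
/*
 * Toy model of the loop touched by the patch: one asynchronous write is
 * queued per cylinder group, and every sync_interval groups the queue is
 * drained. Returns the largest number of writes pending at any one time.
 * sync_interval == 0 models the unpatched loop (never drain).
 */
int peak_pending_writes(int ncg, int sync_interval)
{
    int pending = 0;
    int peak = 0;

    for (int cg = 0; cg < ncg; cg++) {
        pending++;                      /* stands in for bawrite(nbp) */
        if (pending > peak)
            peak = pending;
        if (sync_interval > 0 && cg % sync_interval == 0)
            pending = 0;                /* stands in for ffs_syncvnode(vp, MNT_WAIT) */
    }
    return peak;
}
```

In this model, 100 cylinder groups with no flushing let all 100 writes pile up, while flushing every 10 groups caps the backlog at 10. That matches the description quoted with the patch, and it also suggests why the patch reads as a workaround: it bounds the backlog rather than changing how the buffer cache copes with it.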
-- Tim Bishop http://www.bishnet.net/tim/ PGP Key: 0x5AE7D984 From brent at servuhome.net Wed Nov 12 12:00:37 2008 From: brent at servuhome.net (Brent Jones) Date: Wed Nov 12 12:00:44 2008 Subject: Disk top usage PIDs In-Reply-To: <49107CA1.5090309@egr.msu.edu> References: <49107CA1.5090309@egr.msu.edu> Message-ID: On Tue, Nov 4, 2008 at 8:47 AM, Adam McDougall wrote: > Eduardo Meyer wrote: >> >> Hello, >> >> I have a serious issue. Sometimes something happens and my disk >> performance quickly hits its limit. I follow it with gstat and >> iostat -xw1, and usually everything is fine, with %b around >> 20 and 0 to 1 pending I/O requests. Suddenly I get 30 or 40 pending >> requests and %b is always at 100% (or more). >> >> fstat and lsof give me no hint, because the types of programs as well >> as the number of them are just the same. >> >> How can I find the PID which is hammering my disk? Is there an "iotop" >> or "disktop" tool or something alike? >> >> It's a mail server. I have pop3, imap, I also have maildrop and >> sometimes, httpd, working around the busiest mount point. >> >> I have also started AUDIT, however all I can get are the top PIDs >> which issue read/write requests. Not the requests which take longer to >> perform (the busiest ones), or should I look for some special audit >> class or event other than open, read and write? >> >> Thank you in advance. >> >> > > top -mio > I learn something new every day on this list...!
-- Brent Jones brent@servuhome.net From mi+mill at aldan.algebra.com Wed Nov 12 12:02:29 2008 From: mi+mill at aldan.algebra.com (Mikhail Teterin) Date: Wed Nov 12 12:02:36 2008 Subject: dlopen-ing a library with OpenMP by a non-OpenMP process In-Reply-To: References: <491B1BD2.4050903@aldan.algebra.com> <20081112194350.GJ47073@deviant.kiev.zoral.com.ua> <491B3270.5080402@aldan.algebra.com> Message-ID: <491B3653.5080209@aldan.algebra.com> Sent by Daniel Eischen: > On Wed, 12 Nov 2008, Mikhail Teterin wrote: > >> Sent by Kostik Belousov: >>> On Wed, Nov 12, 2008 at 01:09:22PM -0500, Mikhail Teterin wrote: >>> >>>> Hello! >>>> >>>> Currently, when a program built without OpenMP (-fopenmp) is trying >>>> to dlopen a library, built with the feature, the result is a crash >>>> from "bad system call": >>>> >>>> #0 0x00000008009a223c in ksem_init () from /lib/libc.so.7 >>>> #1 0x0000000800998a8f in sem_init () from /lib/libc.so.7 >> Uhm... That worked... I see... Shouldn't sem_init be nicer about it, >> though? Thanks, > Or perhaps you should read sem(4) ;-) Daniel, what are you saying? That it is all my own fault? The GENERIC kernel does not have sem in it... I built a port with an option (OpenMP) that makes perfect sense, and tried to use it. The software crashes... There is a bug -- and you, instead of contemplating a fix, are telling me to read a man-page? Wow...
-mi From deischen at freebsd.org Wed Nov 12 12:12:58 2008 From: deischen at freebsd.org (Daniel Eischen) Date: Wed Nov 12 12:13:05 2008 Subject: dlopen-ing a library with OpenMP by a non-OpenMP process In-Reply-To: <491B3653.5080209@aldan.algebra.com> References: <491B1BD2.4050903@aldan.algebra.com> <20081112194350.GJ47073@deviant.kiev.zoral.com.ua> <491B3270.5080402@aldan.algebra.com> <491B3653.5080209@aldan.algebra.com> Message-ID: On Wed, 12 Nov 2008, Mikhail Teterin wrote: > Sent by Daniel Eischen: >> On Wed, 12 Nov 2008, Mikhail Teterin wrote: >> >>> Sent by Kostik Belousov: >>>> On Wed, Nov 12, 2008 at 01:09:22PM -0500, Mikhail Teterin wrote: >>>> >>>>> Hello! >>>>> >>>>> Currently, when a program built without OpenMP (-fopenmp) is trying to >>>>> dlopen a library, built with the feature, the result is a crash from >>>>> "bad system call": >>>>> >>>>> #0 0x00000008009a223c in ksem_init () from /lib/libc.so.7 >>>>> #1 0x0000000800998a8f in sem_init () from /lib/libc.so.7 >>> Uhm... That worked... I see... Shouldn't sem_init be nicer about it, >>> though? Thanks, >> Or perhaps you should read sem(4) ;-) > Daniel, what are saying? That it is all my own fault? Generic kernel does not > have sem in it... I build a port with an option (OpenMP), that make perfect > sense, and try to use it. Software crashes... > There is a bug -- and you, instead of contemplating a fix, are telling me to > read a man-page? Wow... No, I simply meant that you saw it was returning bad system call from sem_init/ksem_init. A little investigation would have turned up the reason. If you want to debate whether or not P1003_1B_SEMAPHORES should be standard, that is another issue, which I might actually agree with. 
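
For a program that would rather cope with a kernel lacking the sem module than die with "bad system call", a minimal sketch follows. It rests on two assumptions that are not confirmed anywhere in this thread: that the kernel reports the missing system call by raising SIGSYS, and that with SIGSYS ignored sem_init() then fails with errno set to ENOSYS. probe_posix_semaphores is a hypothetical helper name, not part of any library.

```c
#include <errno.h>
#include <semaphore.h>
#include <signal.h>

/*
 * Returns 0 if an unnamed POSIX semaphore can be created, otherwise the
 * errno value reported by sem_init().
 */
int probe_posix_semaphores(void)
{
    sem_t sem;
    void (*old_handler)(int);
    int result = 0;

    /*
     * Assumption: with SIGSYS ignored, an unimplemented system call
     * returns an error instead of terminating the process.
     */
    old_handler = signal(SIGSYS, SIG_IGN);

    if (sem_init(&sem, 0, 1) == -1)
        result = errno;         /* ENOSYS is expected when sem is not loaded */
    else
        sem_destroy(&sem);

    /* Put the previous SIGSYS disposition back. */
    if (old_handler != SIG_ERR)
        signal(SIGSYS, old_handler);

    return result;
}
```

A host program could call this before dlopen() and, on a non-zero result, skip the OpenMP-enabled library or print a hint to run "kldload sem". This only moves the check into the application; it does not change what sem_init itself does.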
-- DE From deischen at freebsd.org Wed Nov 12 12:13:53 2008 From: deischen at freebsd.org (Daniel Eischen) Date: Wed Nov 12 12:14:01 2008 Subject: dlopen-ing a library with OpenMP by a non-OpenMP process In-Reply-To: <491B3270.5080402@aldan.algebra.com> References: <491B1BD2.4050903@aldan.algebra.com> <20081112194350.GJ47073@deviant.kiev.zoral.com.ua> <491B3270.5080402@aldan.algebra.com> Message-ID: On Wed, 12 Nov 2008, Mikhail Teterin wrote: > Sent by Kostik Belousov: >> On Wed, Nov 12, 2008 at 01:09:22PM -0500, Mikhail Teterin wrote: >> >>> Hello! >>> >>> Currently, when a program built without OpenMP (-fopenmp) is trying to >>> dlopen a library, built with the feature, the result is a crash from "bad >>> system call": >>> >>> #0 0x00000008009a223c in ksem_init () from /lib/libc.so.7 >>> #1 0x0000000800998a8f in sem_init () from /lib/libc.so.7 >>> #2 0x00000008011a6537 in omp_get_nested () from /usr/lib/libgomp.so.1 >>> #3 0x00000008011a3466 in ?? () from /usr/lib/libgomp.so.1 >>> #4 0x0000000000000002 in ?? () >>> #5 0x00000008005072b2 in dlsym () from /libexec/ld-elf.so.1 >>> #6 0x0000000800507cd2 in dlopen () from /libexec/ld-elf.so.1 >>> ... >>> >> Try "kldload sem". >> > Uhm... That worked... I see... Shouldn't sem_init be nicer about it, though? > Thanks, Or perhaps you should read sem(4) ;-) -- DE From mi+mill at aldan.algebra.com Wed Nov 12 12:22:18 2008 From: mi+mill at aldan.algebra.com (Mikhail Teterin) Date: Wed Nov 12 12:22:27 2008 Subject: dlopen-ing a library with OpenMP by a non-OpenMP process In-Reply-To: References: <491B1BD2.4050903@aldan.algebra.com> <20081112194350.GJ47073@deviant.kiev.zoral.com.ua> <491B3270.5080402@aldan.algebra.com> <491B3653.5080209@aldan.algebra.com> Message-ID: <491B3AF3.9020606@aldan.algebra.com> Sent by Daniel Eischen: > No, I simply meant that you saw it was returning bad system > call from sem_init/ksem_init. Instead, I suspected, that it is the OpenMP, that's at fault. 
I'm sorry for failing to live up to your expectations of a true FreeBSD user. > A little investigation would have turned up the reason. If you want > to debate whether or > not P1003_1B_SEMAPHORES should be standard, that is another > issue, which I might actually agree with. Well, I'm sure, the debate on including P1003_1B_SEMAPHORES by default has already raged before... No, what I was suggesting, was that sem_init -- not the system call, but the C-function -- should be more intelligent in detecting such situations and reporting them instead of crashing... Either the C-function, or, maybe, the no-op implementation of (k)sem_init in the kernel ought to have told me (even if only in the kernel log), that I need to kldload sem and refer to the man-page. That's what well-integrated Operating Systems do, anyway... -mi From kostikbel at gmail.com Wed Nov 12 12:28:40 2008 From: kostikbel at gmail.com (Kostik Belousov) Date: Wed Nov 12 12:28:47 2008 Subject: dlopen-ing a library with OpenMP by a non-OpenMP process In-Reply-To: <491B1BD2.4050903@aldan.algebra.com> References: <491B1BD2.4050903@aldan.algebra.com> Message-ID: <20081112194350.GJ47073@deviant.kiev.zoral.com.ua> On Wed, Nov 12, 2008 at 01:09:22PM -0500, Mikhail Teterin wrote: > Hello! > > Currently, when a program built without OpenMP (-fopenmp) is trying to > dlopen a library, built with the feature, the result is a crash from > "bad system call": > > #0 0x00000008009a223c in ksem_init () from /lib/libc.so.7 > #1 0x0000000800998a8f in sem_init () from /lib/libc.so.7 > #2 0x00000008011a6537 in omp_get_nested () from /usr/lib/libgomp.so.1 > #3 0x00000008011a3466 in ?? () from /usr/lib/libgomp.so.1 > #4 0x0000000000000002 in ?? () > #5 0x00000008005072b2 in dlsym () from /libexec/ld-elf.so.1 > #6 0x0000000800507cd2 in dlopen () from /libexec/ld-elf.so.1 > ... > > Can anything be done about this -- disable the OpenMP functionality, but > keep the library usable (single-threaded)? 
The problem arises, in > particular, when one is trying to use libraries built by either > GraphicsMagick or ImageMagick ports. Both have an OpenMP option, which > speeds up some internal algorithms. The option is off by default, but > enabling it makes sense on an SMP system... Yet, this makes the library > unsuitable for other purposes... Thanks! Yours, > > -mi > > P.S. I'm seeing the crash on a recent FreeBSD-7.1/amd64, but it, likely, > can be seen elsewhere. Try "kldload sem". -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081112/62fb2f76/attachment.pgp From pmurray at nevada.net.nz Wed Nov 12 12:37:07 2008 From: pmurray at nevada.net.nz (Philip Murray) Date: Wed Nov 12 12:37:15 2008 Subject: 3Ware 9000 series hangs under load In-Reply-To: <20081112204351.ccc51c2f.lehmann@ans-netz.de> References: <20081029170728.be7cc7ab.lehmann@ans-netz.de> <13394481-8FDC-4934-BB12-FA5BCB2D35CD@nevada.net.nz> <20081112204351.ccc51c2f.lehmann@ans-netz.de> Message-ID: <95E9EA2C-C288-4F11-AD35-FE6AF6633A09@nevada.net.nz> On 13/11/2008, at 8:43 AM, Oliver Lehmann wrote: > Philip Murray wrote: > >> I used to get this (FreeBSD 6.1 days) all the time, the controller >> would just lock up almost on a daily basis (and have to wait for the >> fsck 4 out of 24 hours in the day). >> >> Anyway, I stopped running 3dmd (or 3dm2 I think it's called now) to >> monitor it, and the crashes went away. It's had hundreds of days >> uptime since. >> >> I've never been game enough to try newer versions of 3dm, but a >> cronjob of tw_cli allows me to monitor it now without the lockups. >> >> Might not be your problem, but it's worth a shot if all else fails. > > Ok, it realy looks, like 3dm2 is causing the same problems here too. 
>
> I've tried several 3dm2 versions and beginning with the version released
> with 9.1.5.2 the system is crashing on high i/o loads. The previous
> release included in 9.0.1 and 9.0.2 (made in 2004 iirc) is not crashing.
> Every release which was made later causes system crashes as well.
>
> I'll see what the support staff responds to that....
>
> What cronjobs are you running in particular to "replace" 3dm2?
>

I just installed sysutils/tw_cli from ports, and it sets up some
'periodic' scripts for you. To be precise it puts 407.status-3ware-raid
in /usr/local/etc/periodic/daily

Cheers

Phil

From kostikbel at gmail.com Wed Nov 12 13:05:21 2008
From: kostikbel at gmail.com (Kostik Belousov)
Date: Wed Nov 12 13:05:28 2008
Subject: System deadlock when using mksnap_ffs
In-Reply-To: <20081112194928.GA19539@carrick.bishnet.net>
References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194928.GA19539@carrick.bishnet.net>
Message-ID: <20081112210513.GM47073@deviant.kiev.zoral.com.ua>

On Wed, Nov 12, 2008 at 07:49:28PM +0000, Tim Bishop wrote:
> On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > I run the mksnap_ffs command to take the snapshot and some time later
> > the system completely freezes up:
> >
> > paladin# cd /u2/.snap/
> > paladin# mksnap_ffs /u2 test.1
>
> Someone (not named because they choose not to reply to the list) gave me
> the following patch:
>
> --- sys/ufs/ffs/ffs_snapshot.c.orig Wed Mar 22 09:42:31 2006
> +++ sys/ufs/ffs/ffs_snapshot.c Mon Nov 20 14:59:13 2006
> @@ -282,6 +282,8 @@ restart:
>                  if (error)
>                          goto out;
>                  bawrite(nbp);
> +                if (cg % 10 == 0)
> +                        ffs_syncvnode(vp, MNT_WAIT);
>          }
>          /*
>           * Copy all the cylinder group maps. Although the
> @@ -303,6 +305,8 @@ restart:
>                          goto out;
>                  error = cgaccount(cg, vp, nbp, 1);
>                  bawrite(nbp);
> +                if (cg % 10 == 0)
> +                        ffs_syncvnode(vp, MNT_WAIT);
>                  if (error)
>                          goto out;
>          }
>
> With the description:
>
> "What can happen is on a big file system it will fill up the buffer
> cache with I/O and then run out. When the buffer cache fills up then no
> more disk I/O can happen :-( When you do a sync, it flushes that out to
> disk so things don't hang."
>
> It seems to work too. But it seems more like a workaround than a fix?

It looks hackish, but in fact it is not that wrong, and I even say that
it provides reasonable workaround.

The usual way to prevent wdrain deadlock is to issue bwillwrite() call
before any vnode lock is taken. This is sufficient for most VFS syscalls
that typically put dozen or less dirty buffers into delayed write
queue.

Snapshot creation does not call bwillwrite() at all, and then does a lot
of async writes, completely saturating buffer cache with dirty buffers.
bwillwrite cannot be called after the vnode is locked, and just forcing
a sync for the embryonic snapshot vnode is good enough.

The 10 counter is debatable, but debate shall be postponed until the patch
goes into tree. I ask an anonymous submitter to commit it. Thanks !
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081112/df5e9b1d/attachment.pgp

From dougb at FreeBSD.org Wed Nov 12 14:43:02 2008
From: dougb at FreeBSD.org (Doug Barton)
Date: Wed Nov 12 14:43:08 2008
Subject: RELENG_7 build failure in rescue/ (iconv?)
Message-ID: <491B5BF3.1010309@FreeBSD.org> I'm getting the following with clean, up to date sources (with the BIND 9.4.2-P2 update, but that doesn't touch this area): cc -static -o rescue rescue.o cat.lo chflags.lo chio.lo chmod.lo cp.lo date.lo dd.lo df.lo echo.lo ed.lo expr.lo getfacl.lo hostname.lo kenv.lo kill.lo ln.lo ls.lo mkdir.lo mv.lo pax.lo ps.lo pwd.lo realpath.lo rm.lo rmdir.lo setfacl.lo sh.lo stty.lo sync.lo test.lo rcp.lo csh.lo atacontrol.lo badsect.lo bsdlabel.lo camcontrol.lo ccdconfig.lo clri.lo devfs.lo dmesg.lo dump.lo dumpfs.lo dumpon.lo fsck.lo fsck_ffs.lo fsck_msdosfs.lo fsdb.lo fsirand.lo gbde.lo ifconfig.lo init.lo kldconfig.lo kldload.lo kldstat.lo kldunload.lo ldconfig.lo md5.lo mdconfig.lo mdmfs.lo mknod.lo mount.lo mount_cd9660.lo mount_msdosfs.lo mount_nfs.lo mount_ntfs.lo mount_nullfs.lo mount_udf.lo mount_unionfs.lo newfs.lo newfs_msdos.lo nos-tun.lo ping.lo reboot.lo restore.lo rcorder.lo route.lo routed.lo rtquery.lo rtsol.lo savecore.lo slattach.lo spppcontrol.lo startslip.lo swapon.lo sysctl.lo tunefs.lo umount.lo atmconfig.lo ping6.lo ipf.lo sconfig.lo fdisk.lo dhclient.lo gzip.lo bzip2.lo tar.lo vi.lo id.lo chroot.lo /dumpster/home/dougb/src/rescue/rescue/../librescue/exec.o /dumpster/home/dougb/src/rescue/rescue/../librescue/getusershell.o /dumpster/home/dougb/src/rescue/rescue/../librescue/login_class.o /dumpster/home/dougb/src/rescue/rescue/../librescue/popen.o /dumpster/home/dougb/src/rescue/rescue/../librescue/rcmdsh.o /dumpster/home/dougb/src/rescue/rescue/../librescue/sysctl.o /dumpster/home/dougb/src/rescue/rescue/../librescue/system.o -lcrypt -ledit -lkvm -ll -lm -ltermcap -lutil -lcrypto -lalias -lcam -lcurses -ldevstat -lipsec -lipx -lgeom -lbsdxml -lkiconv -lmd -lreadline -lsbuf -lufs -lz -lbz2 -larchive csh.lo(.text+0xd4c9): In function `nlsclose': : undefined reference to `dl_iconv_close' csh.lo(.text+0xd626): In function `nlsinit': : undefined reference to `dl_iconv_open' csh.lo(.text+0xd716): In function 
`iconv_catgets': : undefined reference to `dl_iconv' *** Error code 1 Stop in /dumpster/home/dougb/src/rescue/rescue. *** Error code 1 -- This .signature sanitized for your protection From tim-lists at bishnet.net Wed Nov 12 16:41:11 2008 From: tim-lists at bishnet.net (Tim Bishop) Date: Wed Nov 12 16:41:18 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081112194735.GK47073@deviant.kiev.zoral.com.ua> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> Message-ID: <20081113004102.GD24360@carrick.bishnet.net> On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote: > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > I've been playing around with snapshots lately but I've got a problem on > > one of my servers running 7-STABLE amd64: > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > I run the mksnap_ffs command to take the snapshot and some time later > > the system completely freezes up: > > > > paladin# cd /u2/.snap/ > > paladin# mksnap_ffs /u2 test.1 > > > > It only happens on this one filesystem, though, which might be to do > > with its size. It's not over the 2TB marker, but it's pretty close. It's > > also backed by a hardware RAID system, although a smaller filesystem on > > the same RAID has no issues. > > > > Filesystem 1K-blocks Used Avail Capacity Mounted on > > /dev/da0s1a 2078881084 921821396 990749202 48% /u2 > > > > To clarify "completely freezes up": unresponsive to all services over > > the network, except ping. On the console I can switch between the ttys, > > but none of them respond. The only way out is to hit the reset button. 
>
> You need to provide information described in the
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
> and especially
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

Ok, I've done that, and removed the patch that seemed to fix things.

The first thing I notice after doing this on the console is that I can
still ctrl+t the process:

load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k

But the top and ps I left running on other ttys have all stopped
responding.

Also the following kernel message came out:

Expensive timeout(9) function: 0xffffffff802ce380(0xffffff000677ca50) 0.006121001 s

There is also still some disk I/O.

Dropping to ddb worked, but I don't have a serial console so I can't
paste the output.

ps shows mksnap_ffs in newbuf, as we already saw. A trace of mksnap_ffs
looks like this:

Tracing pid 2603 tid 100214 td 0xffffff0006a0e370
sched_switch() at sched_switch+0x2a1
mi_switch() at mi_switch+0x233
sleepq_switch() at sleepq_switch+0xe9
sleepq_wait() at sleepq_wait+0x44
_sleep() at _sleep+0x351
getnewbuf() at getnewbuf+0x2e1
getblk() at getblk+0x30d
setup_allocindir_phase2() at setup_allocindir_phase2+0x338
softdep_setup_allocindir_page() at softdep_setup_allocindir_page+0xa7
ffs_balloc_ufs2() at ffs_balloc_ufs2+0x121e
ffs_snapshot() at ffs_snapshot+0xc52
ffs_mount() at ffs_mount+0x735
vfs_donmount() at vfs_donmount+0xeb5
kernel_mount() at kernel_mount+0xa1
ffs_cmount() at ffs_cmount+0x92
mount() at mount+0x1cc
syscall() at syscall+0x1f6
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (21, FreeBSD ELF64, mount), rip = 0x80068636c, rsp = 0x7fffffffe518, rbp = 0x8008447a0 ---

show pcpu shows cpuid 3 (quad core machine) in thread "swi6: Giant taskq".
All the other cpus are idle.
show locks shows: exclusive sleep mutex Giant r = 0 (0xffffffff806ae040) locked @ /usr/src/sys/kern/kern_intr.c:1087 There are two other locks shown by show all locks, one for sshd and one for mysqld, both in kern/uipc_sockbuf.c. show lockedvnods shows mksnap_ffs has a lock on da0s1a with ffs_vget at the top of the stack. Sorry for any typos. I'll sort out a serial cable if more is needed :-) Tim. -- Tim Bishop http://www.bishnet.net/tim/ PGP Key: 0x5AE7D984 From koitsu at FreeBSD.org Wed Nov 12 20:42:03 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Wed Nov 12 20:42:09 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113004102.GD24360@carrick.bishnet.net> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> Message-ID: <20081113044200.GA10419@icarus.home.lan> On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote: > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote: > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > I've been playing around with snapshots lately but I've got a problem on > > > one of my servers running 7-STABLE amd64: > > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > the system completely freezes up: > > > > > > paladin# cd /u2/.snap/ > > > paladin# mksnap_ffs /u2 test.1 > > > > > > It only happens on this one filesystem, though, which might be to do > > > with its size. It's not over the 2TB marker, but it's pretty close. It's > > > also backed by a hardware RAID system, although a smaller filesystem on > > > the same RAID has no issues. 
> > > > > > Filesystem 1K-blocks Used Avail Capacity Mounted on > > > /dev/da0s1a 2078881084 921821396 990749202 48% /u2 > > > > > > To clarify "completely freezes up": unresponsive to all services over > > > the network, except ping. On the console I can switch between the ttys, > > > but none of them respond. The only way out is to hit the reset button. > > > > You need to provide information described in the > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html > > and especially > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html > > Ok, I've done that, and removed the patch that seemed to fix things. > > The first thing I notice after doing this on the console is that I can > still ctrl+t the process: > > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k > > But the top and ps I left running on other ttys have all stopped > responding. Then in my book, the patch didn't fix anything. :-) The system is still "deadlocking"; snapshot generation **should not** wedge the system hard like this. Also, during my own testing, I am always able to use Ctrl-T to get SIGINFO from the running process (mksnap_ffs). That behaviour does not change for me. The rest of the below information is good -- but I'm confused about something: is there anyone out there who can use mksnap_ffs on a filesystem (/usr is a good test source) and NOT experience this deadlocking problem? Literally *every* FreeBSD box I have root access to suffers from this problem, so I'm a little baffled why we end-users need to keep providing debugging output when it should be easy as pie for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch their system wedge. 
Also, a fellow on -fs just mentioned he's having this exact problem: http://lists.freebsd.org/pipermail/freebsd-fs/2008-November/005324.html > Also the following kernel message came out: > > Expensive timeout(9) function: 0xffffffff802ce380(0xffffff000677ca50) 0.006121001 s > > There is also still some disk I/O. > > Dropping to ddb worked, but I don't have a serial console so I can't > paste the output. > > ps shows mksnap_ffs in newbuf, as we already saw. A trace of mksnap_ffs > looks like this: > > Tracing pid 2603 tid 100214 td 0xffffff0006a0e370 > sched_switch() at sched_switch+0x2a1 > mi_switch() at mi_switch+0x233 > sleepq_switch() at sleepq_switch+0xe9 > sleepq_wait() at sleepq_wait+0x44 > _sleep() at _sleep+0x351 > getnewbuf() at getnewbuf+0x2e1 > getblk() at getblk+0x30d > setup_allocindir_phase2() at setup_allocindir_phase2+0x338 > softdep_setup_allocindir_page() at softdep_setup_allocindir_page+0xa7 > ffs_balloc_ufs2() at ffs_balloc_ufs2+0x121e > ffs_snapshot() at ffs_snapshot+0xc52 > ffs_mount() at ffs_mount+0x735 > vfs_donmount() at vfs_donmount+0xeb5 > kernel_mount() at kernel_mount+0xa1 > ffs_cmount() at ffs_cmount+0x92 > mount() at mount+0x1cc > syscall() at syscall+0x1f6 > Xfast_syscall() at Xfast_syscall+0xab > --- syscall (21, FreeBSD ELF64, mount), rip = 0x80068636c, rsp = 0x7fffffffe518, rbp = 0x8008447a0 --- > > show pcpu shows cpuid 3 (quad core machine) in thread "swi6: Giant taskq". > All the other cpus are idle. > > show locks shows: > > exclusive sleep mutex Giant r = 0 (0xffffffff806ae040) locked @ /usr/src/sys/kern/kern_intr.c:1087 > > There are two other locks shown by show all locks, one for sshd and one > for mysqld, both in kern/uipc_sockbuf.c. > > show lockedvnods shows mksnap_ffs has a lock on da0s1a with ffs_vget at > the top of the stack. > > Sorry for any typos. I'll sort out a serial cable if more is needed :-) > > Tim. 
-- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From ambrisko at ambrisko.com Wed Nov 12 21:00:53 2008 From: ambrisko at ambrisko.com (Doug Ambrisko) Date: Wed Nov 12 21:01:01 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113044200.GA10419@icarus.home.lan> Message-ID: <200811130500.mAD50rqt051930@ambrisko.com> Jeremy Chadwick writes: [snip] | The rest of the below information is good -- but I'm confused about | something: is there anyone out there who can use mksnap_ffs on a | filesystem (/usr is a good test source) and NOT experience this | deadlocking problem? Literally *every* FreeBSD box I have root access | to suffers from this problem, so I'm a little baffled why we end-users | need to keep providing debugging output when it should be easy as pie | for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch | their system wedge. We can at work, but we have a bunch of other patches. There are a few problems with the buffer cache: 1) The buffer daemon can't use the space that is reserved for it since to flush some stuff it needs to use more buffers. 2) The buffer cache can get fragmented to prevent large I/O which the buffer daemon may need. 3) Other issues ... I have fix for "1". It is pretty easy. I have a hack'ish fix for "2" in the I make all request use max size so it can't get fragmented since there is no code to defrag and it isn't trivial to defrag the memory. I have some fixes for some other issues, but there were some review issues with them. I might just commit the fixes for 1 and 2. It makes things better and there was no-objections at the time. We have the patches in shipping products. I can try to do some experiments at work like you said since I had similar things working before and it is pretty easy to put in printf's to see the issue. Doug A. 
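The workaround discussed in this thread (force a synchronous flush of the
snapshot vnode every 10 cylinder groups, so that the async writes issued
during snapshot creation cannot saturate the buffer cache with dirty
buffers) can be illustrated with a toy model. This is only a sketch in
Python, not kernel code: the function name simulate_snapshot, the cache
capacity, the number of buffers queued per cylinder group and the
background drain rate are all made-up assumptions, chosen only to show
the shape of the problem.

```python
def simulate_snapshot(num_cgs, cache_capacity, bufs_per_cg=4,
                      drained_per_cg=1, sync_every=None):
    """Toy model of dirty-buffer growth while a snapshot is being created.

    All parameters are hypothetical. Each cylinder group queues
    `bufs_per_cg` dirty buffers (standing in for the async writes), while
    the disk retires only `drained_per_cg` of them in the background.
    If `sync_every` is set, a synchronous flush empties the queue whenever
    cg % sync_every == 0, mirroring the `cg % 10 == 0` test in the patch.
    """
    dirty = 0
    peak = 0
    for cg in range(num_cgs):
        dirty += bufs_per_cg                      # async writes queued for this cg
        peak = max(peak, dirty)
        if dirty > cache_capacity:                # cache saturated: the model "wedges"
            return {"stalled_at_cg": cg, "peak_dirty": peak}
        dirty = max(0, dirty - drained_per_cg)    # background I/O completes
        if sync_every is not None and cg % sync_every == 0:
            dirty = 0                             # synchronous flush waits for everything
    return {"stalled_at_cg": None, "peak_dirty": peak}


if __name__ == "__main__":
    print("no periodic sync:  ", simulate_snapshot(1000, 100))
    print("sync every 10 cgs: ", simulate_snapshot(1000, 100, sync_every=10))
```

In this model the unpatched run stalls once the queue outgrows the cache,
while the run with a flush every 10 cylinder groups keeps the number of
dirty buffers bounded regardless of how many cylinder groups the file
system has. It says nothing about the right value for the interval, which
the thread itself leaves open.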
From david at catwhisker.org Wed Nov 12 21:02:59 2008 From: david at catwhisker.org (David Wolfskill) Date: Wed Nov 12 21:03:06 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113044200.GA10419@icarus.home.lan> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> Message-ID: <20081113050250.GR69155@bunrab.catwhisker.org> On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote: > ... > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > > I've been playing around with snapshots lately but I've got a problem on > > > > one of my servers running 7-STABLE amd64: > > > > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > > the system completely freezes up: > > > > > > > > paladin# cd /u2/.snap/ > > > > paladin# mksnap_ffs /u2 test.1 > > > > > > > > It only happens on this one filesystem, though, which might be to do > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's > > > > also backed by a hardware RAID system, although a smaller filesystem on > > > > the same RAID has no issues. > ... > Then in my book, the patch didn't fix anything. :-) The system is > still "deadlocking"; snapshot generation **should not** wedge the system > hard like this. > > Also, during my own testing, I am always able to use Ctrl-T to get > SIGINFO from the running process (mksnap_ffs). That behaviour does not > change for me. > > The rest of the below information is good -- but I'm confused about > something: is there anyone out there who can use mksnap_ffs on a > filesystem (/usr is a good test source) and NOT experience this > deadlocking problem? I hadn't ever tried until I saw your message. 
Granted, I'm using a smaller file system (I doubt that I have a total of
as much as 2 TB in all my machines combined), and I'm running i386, vs.
amd64. But it ran just fine. I wasn't able to test SIGINFO; it finished
before I had a chance. (I ran it under time(1); wall clock time was
0.91 sec.)

> Literally *every* FreeBSD box I have root access
> to suffers from this problem, so I'm a little baffled why we end-users
> need to keep providing debugging output when it should be easy as pie
> for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch
> their system wedge.

Well, I routinely use dump/restore pipelines to copy file systems
around; never had a problem with it.

> ...

For reference:

freebeast(7.1-P)[9] uname -a
FreeBSD freebeast.catwhisker.org 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #127: Wed Nov 12 05:16:20 PST 2008 root@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/FREEBEAST i386
freebeast(7.1-P)[10] ls -la
total 4
drwxrwxr-x   2 root  operator  512 Nov 12 20:53 .
drwxr-xr-x  14 root  wheel     512 Jan 22  2008 ..
freebeast(7.1-P)[11] /usr/bin/time -l mksnap_ffs /S2/usr test.1
        0.91 real         0.00 user         0.05 sys
       976  maximum resident set size
         3  average shared memory size
       627  average unshared data size
       109  average unshared stack size
       104  page reclaims
         0  page faults
         0  swaps
         1  block input operations
       230  block output operations
         0  messages sent
         0  messages received
         0  signals received
       101  voluntary context switches
        34  involuntary context switches
freebeast(7.1-P)[12] ls -la
total 1460
drwxrwxr-x   2 root  operator         512 Nov 12 20:54 .
drwxr-xr-x  14 root  wheel            512 Jan 22  2008 ..
-r--r-----   1 root  operator  2410791056 Nov 12 20:54 test.1
freebeast(7.1-P)[13]

Peace,
david
--
David H. Wolfskill david@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081113/c43ab60a/attachment.pgp From ambrisko at ambrisko.com Wed Nov 12 21:16:54 2008 From: ambrisko at ambrisko.com (Doug Ambrisko) Date: Wed Nov 12 21:17:02 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081112210513.GM47073@deviant.kiev.zoral.com.ua> Message-ID: <200811130447.mAD4lbJG051137@ambrisko.com> Kostik Belousov writes: | On Wed, Nov 12, 2008 at 07:49:28PM +0000, Tim Bishop wrote: | > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: | > > I run the mksnap_ffs command to take the snapshot and some time later | > > the system completely freezes up: | > > | > > paladin# cd /u2/.snap/ | > > paladin# mksnap_ffs /u2 test.1 | > | > Someone (not named because they choose not to reply to the list) gave me | > the following patch: | > | > --- sys/ufs/ffs/ffs_snapshot.c.orig Wed Mar 22 09:42:31 2006 | > +++ sys/ufs/ffs/ffs_snapshot.c Mon Nov 20 14:59:13 2006 | > @@ -282,6 +282,8 @@ restart: | > if (error) | > goto out; | > bawrite(nbp); | > + if (cg % 10 == 0) | > + ffs_syncvnode(vp, MNT_WAIT); | > } | > /* | > * Copy all the cylinder group maps. Although the | > @@ -303,6 +305,8 @@ restart: | > goto out; | > error = cgaccount(cg, vp, nbp, 1); | > bawrite(nbp); | > + if (cg % 10 == 0) | > + ffs_syncvnode(vp, MNT_WAIT); | > if (error) | > goto out; | > } | > | > With the description: | > | > "What can happen is on a big file system it will fill up the buffer | > cache with I/O and then run out. When the buffer cache fills up then no | > more disk I/O can happen :-( When you do a sync, it flushes that out to | > disk so things don't hang." | > | > It seems to work too. But it seems more like a workaround than a fix? | | It looks hackish, but in fact it is not that wrong, and I even say that | it provides reasonable workaround. 
|
| The usual way to prevent wdrain deadlock is to issue bwillwrite() call
| before any vnode lock is taken. This is sufficient for most VFS syscalls
| that typically put dozen or less dirty buffers into delayed write
| queue.
|
| Snapshot creation does not call bwillwrite() at all, and then does a lot
| of async writes, completely saturating buffer cache with dirty buffers.
| bwillwrite cannot be called after the vnode is locked, and just forcing
| a sync for the embryonic snapshot vnode is good enough.
|
| The 10 counter is debatable, but debate shall be postponed until the patch
| goes into tree. I ask an anonymous submitter to commit it. Thanks !

I plan to commit it tomorrow since I sent it to Tim to test. The 10 can
be tuned but it has kept a bunch of machines at work up. Glad people
don't think that it is too wrong :-) It probably could be made
a little more dynamic but I wonder if it would show any real performance
difference and might risk more bugs.

Doug A.

From koitsu at FreeBSD.org Wed Nov 12 22:05:24 2008
From: koitsu at FreeBSD.org (Jeremy Chadwick)
Date: Wed Nov 12 22:05:32 2008
Subject: System deadlock when using mksnap_ffs
In-Reply-To: <20081113050250.GR69155@bunrab.catwhisker.org>
References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> <20081113050250.GR69155@bunrab.catwhisker.org>
Message-ID: <20081113060521.GA11595@icarus.home.lan>

On Wed, Nov 12, 2008 at 09:02:50PM -0800, David Wolfskill wrote:
> On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
> > ...
> > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > > > I've been playing around with snapshots lately but I've got a problem on > > > > > one of my servers running 7-STABLE amd64: > > > > > > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > > > the system completely freezes up: > > > > > > > > > > paladin# cd /u2/.snap/ > > > > > paladin# mksnap_ffs /u2 test.1 > > > > > > > > > > It only happens on this one filesystem, though, which might be to do > > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's > > > > > also backed by a hardware RAID system, although a smaller filesystem on > > > > > the same RAID has no issues. > > ... > > Then in my book, the patch didn't fix anything. :-) The system is > > still "deadlocking"; snapshot generation **should not** wedge the system > > hard like this. > > > > Also, during my own testing, I am always able to use Ctrl-T to get > > SIGINFO from the running process (mksnap_ffs). That behaviour does not > > change for me. > > > > The rest of the below information is good -- but I'm confused about > > something: is there anyone out there who can use mksnap_ffs on a > > filesystem (/usr is a good test source) and NOT experience this > > deadlocking problem? > > I hadn't ever tried until I saw your message. Granted, I'm using a > smaller file system (I doubt that I have a toital of as much as 2 TB in > all my machines combined), and I'm running i386, vs. amd64. But it ran > just fine. I wasn't able to test SIGINFO; it finished before I had a > chance. (I ran it under time(1); wall clock time was 0.91 sec.) 
> > > Literally *every* FreeBSD box I have root access > > to suffers from this problem, so I'm a little baffled why we end-users > > need to keep providing debugging output when it should be easy as pie > > for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch > > their system wedge. > > Well, I routinely use dump/restore pipelines to copy file systems > around; never had a problem with it. > > > ... > > For reference: > > freebeast(7.1-P)[9] uname -a > FreeBSD freebeast.catwhisker.org 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #127: Wed Nov 12 05:16:20 PST 2008 root@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/FREEBEAST i386 > freebeast(7.1-P)[10] ls -la > total 4 > drwxrwxr-x 2 root operator 512 Nov 12 20:53 . > drwxr-xr-x 14 root wheel 512 Jan 22 2008 .. > freebeast(7.1-P)[11] /usr/bin/time -l mksnap_ffs /S2/usr test.1 > 0.91 real 0.00 user 0.05 sys > 976 maximum resident set size > 3 average shared memory size > 627 average unshared data size > 109 average unshared stack size > 104 page reclaims > 0 page faults > 0 swaps > 1 block input operations > 230 block output operations > 0 messages sent > 0 messages received > 0 signals received > 101 voluntary context switches > 34 involuntary context switches > freebeast(7.1-P)[12] ls -la > total 1460 > drwxrwxr-x 2 root operator 512 Nov 12 20:54 . > drwxr-xr-x 14 root wheel 512 Jan 22 2008 .. > -r--r----- 1 root operator 2410791056 Nov 12 20:54 test.1 > freebeast(7.1-P)[13] David, thanks for chiming in. This is exactly what I was fearing/worried about. It would be greatly beneficial if we could figure out what triggers the slowdown for a lot of us, since for others (proof above) mksnap_ffs behaves as expected. 
Since I'm able to reproduce this pretty much everywhere, here's information: # df -ki /usr Filesystem 1024-blocks Used Avail Capacity iused ifree %iused Mounted on /dev/ad4s1f 163815904 3835274 146875358 3% 254864 20941934 1% /usr # cd /usr/.snap # /usr/bin/time -l mksnap_ffs /usr test.1 load: 1.90 cmd: mksnap_ffs 11719 [wdrain] 0.00u 0.07s 0% 1092k 23.25 real 0.00 user 0.00 sys 135.98 real 0.00 user 0.62 sys 1092 maximum resident set size 4 average shared memory size 1081 average unshared data size 135 average unshared stack size 101 page reclaims 0 page faults 0 swaps 895 block input operations 13444 block output operations 0 messages sent 0 messages received 0 signals received 6433 voluntary context switches 197 involuntary context switches # ls -l test.1 -r--r----- 1 root operator 173203463240 Nov 12 21:42 test.1 David's filesystem is 2GBs, while mine is 16GB. His snap takes under 1 second, yet mine takes over 2 minutes. Possibly the large deviation is explained by the amount of space used on the filesystem or the number of inodes in use? -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From wb at freebie.xs4all.nl Wed Nov 12 22:53:06 2008 From: wb at freebie.xs4all.nl (Wilko Bulte) Date: Wed Nov 12 22:53:23 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113044200.GA10419@icarus.home.lan> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> Message-ID: <20081113065300.GA1276@freebie.xs4all.nl> Quoting Jeremy Chadwick, who wrote on Wed, Nov 12, 2008 at 08:42:00PM -0800 .. 
> On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote: > > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote: > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > > I've been playing around with snapshots lately but I've got a problem on > > > > one of my servers running 7-STABLE amd64: > > > > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > > the system completely freezes up: > > > > > > > > paladin# cd /u2/.snap/ > > > > paladin# mksnap_ffs /u2 test.1 > > > > > > > > It only happens on this one filesystem, though, which might be to do > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's > > > > also backed by a hardware RAID system, although a smaller filesystem on > > > > the same RAID has no issues. > > > > > > > > Filesystem 1K-blocks Used Avail Capacity Mounted on > > > > /dev/da0s1a 2078881084 921821396 990749202 48% /u2 > > > > > > > > To clarify "completely freezes up": unresponsive to all services over > > > > the network, except ping. On the console I can switch between the ttys, > > > > but none of them respond. The only way out is to hit the reset button. > > > > > > You need to provide information described in the > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html > > > and especially > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html > > > > Ok, I've done that, and removed the patch that seemed to fix things. > > > > The first thing I notice after doing this on the console is that I can > > still ctrl+t the process: > > > > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k > > > > But the top and ps I left running on other ttys have all stopped > > responding. 
> > Then in my book, the patch didn't fix anything. :-) The system is > still "deadlocking"; snapshot generation **should not** wedge the system > hard like this. > > Also, during my own testing, I am always able to use Ctrl-T to get > SIGINFO from the running process (mksnap_ffs). That behaviour does not > change for me. > > The rest of the below information is good -- but I'm confused about > something: is there anyone out there who can use mksnap_ffs on a > filesystem (/usr is a good test source) and NOT experience this > deadlocking problem? Literally *every* FreeBSD box I have root access > to suffers from this problem, so I'm a little baffled why we end-users > need to keep providing debugging output when it should be easy as pie > for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch > their system wedge. dump -L on my RELENG_7 machine does not wedge it. So there must be multiple factors influencing the snap creating problems or not. Wilko From peterjeremy at optushome.com.au Wed Nov 12 23:06:55 2008 From: peterjeremy at optushome.com.au (Peter Jeremy) Date: Wed Nov 12 23:07:02 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <200811130447.mAD4lbJG051137@ambrisko.com> References: <20081112210513.GM47073@deviant.kiev.zoral.com.ua> <200811130447.mAD4lbJG051137@ambrisko.com> Message-ID: <20081113070648.GC1098@server.vk2pj.dyndns.org> On 2008-Nov-12 20:47:37 -0800, Doug Ambrisko wrote: >I plan to commit it tomorrow since I sent it to Tim to test. The 10 can >be tuned but it has kept a bunch of machines at work up. Glad people >don't think it is that it is to wrong :-) It probably could be made >a little more dynamic but I wonder if it would show any real performance >difference and might risk more bugs. 
FWIW, I've been running the patch since I first saw Doug post it in Feb 2006 and don't recall ever having problems with mksnap_ffs since applying it (I did before) -- Peter Jeremy Please excuse any delays as the result of my ISP's inability to implement an MTA that is either RFC2821-compliant or matches their claimed behaviour. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081113/b22b5703/attachment.pgp From toasty at dragondata.com Wed Nov 12 23:54:01 2008 From: toasty at dragondata.com (Kevin Day) Date: Wed Nov 12 23:54:08 2008 Subject: System deadlock when using mksnap_ffs Message-ID: (moving my thread from -fs to -stable) Before touching anything, here's a description of the symptoms I see... Rather busy system, with quite a bit of filesystem activity occurring while the snapshot is being made. Quad CPU amd64 box with 16GB of ram, 6x10Krpm RAID array. Should be reasonably fast. Filesystem 1K-blocks Used Avail Capacity iused ifree %iused Mounted on /dev/da0s1a 739339824 74357926 605834714 11% 1718540 93855474 2% / 1.7 million inodes, 71G used of a 705G volume. Here's a timeline of what I see when starting to make a new snapshot. I've got a few windows running, showing "top", "iostat", etc. Baseline disk activity before starting anything: device r/s w/s kr/s kw/s wait svc_t b da0 24.0 2.0 355.6 32.0 1 10.7 28 0m0s: Snapshot begins, using "mount -u -o snapshot //.snap/weekly. 0 /" Drives immediately jump to 100% busy as expected. device r/s w/s kr/s kw/s wait svc_t b da0 153.8 6.0 3378.6 95.9 2 16.9 100 the mount process is spending 100% of its time in "biord". 2m10s: The mount process starts spending more and more time in "snaplk", alternating with "biord". 
device     r/s   w/s    kr/s    kw/s wait svc_t   b
da0       77.9  67.9  1270.7  3754.2    1  10.7 100

12m15s: The first intermittent slowdowns start affecting other processes
on the system. Occasionally all active processes will get stuck in
"snaplk" or "ufs" for 5-10 seconds before resuming.

device     r/s   w/s    kr/s    kw/s wait svc_t   b
da0       77.9  31.0  1150.8  1054.9    1  10.4 100

114m47s: Active processes are briefly stuck in "suspfs".

115m22s: Mount is now in "snaprdb". Active processes are now completely
stuck in "snaplk". Still responsive to SIGINFO, top is still running,
etc. Just hangs any time anything needs the filesystem.

device     r/s   w/s    kr/s    kw/s wait svc_t   b
da0      238.8   0.0  3820.1     0.0    1   4.1  99

143m19s: Mount now in wdrain.

143m34s: Finished. Snapshot logging shows
"/: suspended 13.308 sec, redo 153 of 4058"

Most processes were hung for 28 minutes.

Is this what others are seeing? It sounds like some of the complaints
are about it getting stuck in the "wdrain" state, not what I'm showing
here.

Another mildly annoying note: any process that touches ".snap" while a
snapshot is being generated gets stuck in "ufs" until it finishes. I can
understand wanting to keep operations in there in sync, but it would be
really nice if "find /" wouldn't get hung when it tries to descend into
.snap, for example.
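A possible workaround sketch for the "find /" case described above: tell
find not to enter any directory named ".snap". The function name
list_without_snap is hypothetical, and it is only an assumption that
skipping the directory contents is enough to stay clear of the "ufs"
wait; this has not been verified against a snapshot in progress.

```shell
#!/bin/sh
# list_without_snap ROOT (hypothetical helper):
# walk ROOT and print every path, but prune any directory named ".snap"
# so find never reads or stats the snapshot files inside it.
list_without_snap() {
    find "$1" -name .snap -prune -o -print
}
```

Usage would be along the lines of "list_without_snap /", with further
find primaries added after -print as needed.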
ts5# cd /.snap ts5# ls -l ^T load: 0.17 cmd: ls 3696 [ufs] 0.00u 0.00s 0% 1496k From tim-lists at bishnet.net Thu Nov 13 00:35:12 2008 From: tim-lists at bishnet.net (Tim Bishop) Date: Thu Nov 13 00:35:19 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113044200.GA10419@icarus.home.lan> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> Message-ID: <20081113083504.GF24360@carrick.bishnet.net> Jeremy, On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote: > On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote: > > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote: > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > > the system completely freezes up: > > > > > > > > paladin# cd /u2/.snap/ > > > > paladin# mksnap_ffs /u2 test.1 > > > > > > You need to provide information described in the > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html > > > and especially > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html > > > > Ok, I've done that, and removed the patch that seemed to fix things. > > > > The first thing I notice after doing this on the console is that I can > > still ctrl+t the process: > > > > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k > > > > But the top and ps I left running on other ttys have all stopped > > responding. > > Then in my book, the patch didn't fix anything. :-) The system is > still "deadlocking"; snapshot generation **should not** wedge the system > hard like this. You missed the part where I said I "removed the patch". I did that so I could provide details with it wedged. 
I agree that there's still some fundamental speed issues with snapshotting though. And I'm sure the FS itself will still be locked out for a while during the snapshot. But with the patch at least the whole thing doesn't lock up. Tim. -- Tim Bishop http://www.bishnet.net/tim/ PGP Key: 0x5AE7D984 From koitsu at FreeBSD.org Thu Nov 13 01:15:52 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Thu Nov 13 01:16:00 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113060521.GA11595@icarus.home.lan> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> <20081113050250.GR69155@bunrab.catwhisker.org> <20081113060521.GA11595@icarus.home.lan> Message-ID: <20081113091549.GA15888@icarus.home.lan> On Wed, Nov 12, 2008 at 10:05:21PM -0800, Jeremy Chadwick wrote: > On Wed, Nov 12, 2008 at 09:02:50PM -0800, David Wolfskill wrote: > > On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote: > > > ... > > > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > > > > I've been playing around with snapshots lately but I've got a problem on > > > > > > one of my servers running 7-STABLE amd64: > > > > > > > > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > > > > the system completely freezes up: > > > > > > > > > > > > paladin# cd /u2/.snap/ > > > > > > paladin# mksnap_ffs /u2 test.1 > > > > > > > > > > > > It only happens on this one filesystem, though, which might be to do > > > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's > > > > > > also backed by a hardware RAID system, although a smaller filesystem on > > > > > > the same RAID has no issues. > > > ... 
> > > Then in my book, the patch didn't fix anything. :-) The system is > > > still "deadlocking"; snapshot generation **should not** wedge the system > > > hard like this. > > > > > > Also, during my own testing, I am always able to use Ctrl-T to get > > > SIGINFO from the running process (mksnap_ffs). That behaviour does not > > > change for me. > > > > > > The rest of the below information is good -- but I'm confused about > > > something: is there anyone out there who can use mksnap_ffs on a > > > filesystem (/usr is a good test source) and NOT experience this > > > deadlocking problem? > > > > I hadn't ever tried until I saw your message. Granted, I'm using a > > smaller file system (I doubt that I have a toital of as much as 2 TB in > > all my machines combined), and I'm running i386, vs. amd64. But it ran > > just fine. I wasn't able to test SIGINFO; it finished before I had a > > chance. (I ran it under time(1); wall clock time was 0.91 sec.) > > > > > Literally *every* FreeBSD box I have root access > > > to suffers from this problem, so I'm a little baffled why we end-users > > > need to keep providing debugging output when it should be easy as pie > > > for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch > > > their system wedge. > > > > Well, I routinely use dump/restore pipelines to copy file systems > > around; never had a problem with it. > > > > > ... > > > > For reference: > > > > freebeast(7.1-P)[9] uname -a > > FreeBSD freebeast.catwhisker.org 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #127: Wed Nov 12 05:16:20 PST 2008 root@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/FREEBEAST i386 > > freebeast(7.1-P)[10] ls -la > > total 4 > > drwxrwxr-x 2 root operator 512 Nov 12 20:53 . > > drwxr-xr-x 14 root wheel 512 Jan 22 2008 .. 
> > freebeast(7.1-P)[11] /usr/bin/time -l mksnap_ffs /S2/usr test.1 > > 0.91 real 0.00 user 0.05 sys > > 976 maximum resident set size > > 3 average shared memory size > > 627 average unshared data size > > 109 average unshared stack size > > 104 page reclaims > > 0 page faults > > 0 swaps > > 1 block input operations > > 230 block output operations > > 0 messages sent > > 0 messages received > > 0 signals received > > 101 voluntary context switches > > 34 involuntary context switches > > freebeast(7.1-P)[12] ls -la > > total 1460 > > drwxrwxr-x 2 root operator 512 Nov 12 20:54 . > > drwxr-xr-x 14 root wheel 512 Jan 22 2008 .. > > -r--r----- 1 root operator 2410791056 Nov 12 20:54 test.1 > > freebeast(7.1-P)[13] > > David, thanks for chiming in. This is exactly what I was > fearing/worried about. > > It would be greatly beneficial if we could figure out what triggers the > slowdown for a lot of us, since for others (proof above) mksnap_ffs > behaves as expected. > > Since I'm able to reproduce this pretty much everywhere, here's > information: > > # df -ki /usr > Filesystem 1024-blocks Used Avail Capacity iused ifree %iused Mounted on > /dev/ad4s1f 163815904 3835274 146875358 3% 254864 20941934 1% /usr > > # cd /usr/.snap > # /usr/bin/time -l mksnap_ffs /usr test.1 > > > > load: 1.90 cmd: mksnap_ffs 11719 [wdrain] 0.00u 0.07s 0% 1092k > 23.25 real 0.00 user 0.00 sys > > 135.98 real 0.00 user 0.62 sys > 1092 maximum resident set size > 4 average shared memory size > 1081 average unshared data size > 135 average unshared stack size > 101 page reclaims > 0 page faults > 0 swaps > 895 block input operations > 13444 block output operations > 0 messages sent > 0 messages received > 0 signals received > 6433 voluntary context switches > 197 involuntary context switches > # ls -l test.1 > -r--r----- 1 root operator 173203463240 Nov 12 21:42 test.1 > > David's filesystem is 2GBs, while mine is 16GB. His snap takes under 1 > second, yet mine takes over 2 minutes. 
> > Possibly the large deviation is explained by the amount of space used on > the filesystem or the number of inodes in use? I also want to add that snapshot removal (e.g. rm test.1) is equally as slow (rm process is also in wdrain); takes about 20 seconds for the above test.1 snapshot. Maybe long durations during deletion are justified though, I don't know. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From kostikbel at gmail.com Thu Nov 13 02:26:47 2008 From: kostikbel at gmail.com (Kostik Belousov) Date: Thu Nov 13 02:26:55 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113044200.GA10419@icarus.home.lan> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> Message-ID: <20081113102642.GQ47073@deviant.kiev.zoral.com.ua> On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote: > On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote: > > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote: > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > > I've been playing around with snapshots lately but I've got a problem on > > > > one of my servers running 7-STABLE amd64: > > > > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > > the system completely freezes up: > > > > > > > > paladin# cd /u2/.snap/ > > > > paladin# mksnap_ffs /u2 test.1 > > > > > > > > It only happens on this one filesystem, though, which might be to do > > > > with its size. It's not over the 2TB marker, but it's pretty close. 
It's > > > > also backed by a hardware RAID system, although a smaller filesystem on > > > > the same RAID has no issues. > > > > > > > > Filesystem 1K-blocks Used Avail Capacity Mounted on > > > > /dev/da0s1a 2078881084 921821396 990749202 48% /u2 > > > > > > > > To clarify "completely freezes up": unresponsive to all services over > > > > the network, except ping. On the console I can switch between the ttys, > > > > but none of them respond. The only way out is to hit the reset button. > > > > > > You need to provide information described in the > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html > > > and especially > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html > > > > Ok, I've done that, and removed the patch that seemed to fix things. > > > > The first thing I notice after doing this on the console is that I can > > still ctrl+t the process: > > > > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k > > > > But the top and ps I left running on other ttys have all stopped > > responding. > > Then in my book, the patch didn't fix anything. :-) The system is > still "deadlocking"; snapshot generation **should not** wedge the system > hard like this. You systematically mix two completely different issues: - first one is the _deadlock_ experienced by Tim; - second one is the slowdown during snapshot creation. In fact, I may count third, where dump itself hangs, as a usermode process, but kernel still normally operates. Patch posted should fix or paper over the first issue for practical means. Third issue most likely fixed by the subr_sleepqueue race fix. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081113/512cafbd/attachment.pgp From koitsu at FreeBSD.org Thu Nov 13 02:45:16 2008 From: koitsu at FreeBSD.org (Jeremy Chadwick) Date: Thu Nov 13 02:45:23 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113102642.GQ47073@deviant.kiev.zoral.com.ua> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> <20081113102642.GQ47073@deviant.kiev.zoral.com.ua> Message-ID: <20081113104514.GA17589@icarus.home.lan> On Thu, Nov 13, 2008 at 12:26:42PM +0200, Kostik Belousov wrote: > On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote: > > On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote: > > > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote: > > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > > > I've been playing around with snapshots lately but I've got a problem on > > > > > one of my servers running 7-STABLE amd64: > > > > > > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > > > the system completely freezes up: > > > > > > > > > > paladin# cd /u2/.snap/ > > > > > paladin# mksnap_ffs /u2 test.1 > > > > > > > > > > It only happens on this one filesystem, though, which might be to do > > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's > > > > > also backed by a hardware RAID system, although a smaller filesystem on > > > > > the same RAID has no issues. 
> > > > > > > > > > Filesystem 1K-blocks Used Avail Capacity Mounted on > > > > > /dev/da0s1a 2078881084 921821396 990749202 48% /u2 > > > > > > > > > > To clarify "completely freezes up": unresponsive to all services over > > > > > the network, except ping. On the console I can switch between the ttys, > > > > > but none of them respond. The only way out is to hit the reset button. > > > > > > > > You need to provide information described in the > > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html > > > > and especially > > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html > > > > > > Ok, I've done that, and removed the patch that seemed to fix things. > > > > > > The first thing I notice after doing this on the console is that I can > > > still ctrl+t the process: > > > > > > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k > > > > > > But the top and ps I left running on other ttys have all stopped > > > responding. > > > > Then in my book, the patch didn't fix anything. :-) The system is > > still "deadlocking"; snapshot generation **should not** wedge the system > > hard like this. > You systematically mix two completely different issues: > - first one is the _deadlock_ experienced by Tim; Re-read what he wrote. Quote: "Ok, I've done that, and removed the patch that seemed to fix things. The first thing I notice after doing this on the console is that I can still ctrl+t the process: load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k But the top and ps I left running on other ttys have all stopped responding." If he can press Control-T, it means SIGINFO can be sent to the mksnap_ffs process, and the process responds with that information. 
So, the system is not deadlocked -- meaning, I believe what he experiences is what others experience (the system becomes completely unusable during mksnap_ffs running, but DOES NOT hang or lock up, it just becomes so god-awful slow that processes on the machine literally sit and spin for minutes at a time). > - second one is the slowdown during snapshot creation. > In fact, I may count third, where dump itself hangs, as a usermode process, > but kernel still normally operates. > > Patch posted should fix or paper over the first issue for practical means. > Third issue most likely fixed by the subr_sleepqueue race fix. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From kometen at gmail.com Thu Nov 13 03:22:13 2008 From: kometen at gmail.com (Claus Guttesen) Date: Thu Nov 13 03:22:21 2008 Subject: qlogic qle2462 hba and freebsd stable on a dl360 g5 Message-ID: Hi. I'm looking at a qlogic qle2462 hba for my dl360 g5. The thread http://www.mail-archive.com/freebsd-stable@freebsd.org/msg99497.html mentions a deadlock when system is loaded. Has this issue been resolved? Are there other PCI Express hba's which are known to work with freebsd stable and dl360 g5? -- regards Claus When lenity and cruelty play for a kingdom, the gentler gamester is the soonest winner. Shakespeare From david at esn.org.za Thu Nov 13 03:24:19 2008 From: david at esn.org.za (David Peall) Date: Thu Nov 13 03:24:27 2008 Subject: ipfw erratic on 7 stable Message-ID: Hi I'm having a problem with ipfw, I think. 
For some reason it denies packets randomly, for example:

PING 196.14.239.2 (196.14.239.2): 56 data bytes
ping: sendto: Permission denied
ping: sendto: Permission denied
64 bytes from 196.14.239.2: icmp_seq=2 ttl=63 time=0.258 ms
64 bytes from 196.14.239.2: icmp_seq=3 ttl=63 time=0.233 ms
64 bytes from 196.14.239.2: icmp_seq=4 ttl=63 time=0.211 ms

Not sure what else would be helpful at this point.

Running:
FreeBSD 7.1-PRERELEASE #0: Fri Oct 31 09:44:07 UTC 2008

--
David Peall :: IT Manager
e-Schools' Network :: http://www.esn.org.za/
Phone +27 (021) 674-9140

From david at esn.org.za Thu Nov 13 04:26:14 2008
From: david at esn.org.za (David Peall)
Date: Thu Nov 13 04:26:21 2008
Subject: ipfw erratic on 7 stable
In-Reply-To:
References:
Message-ID:

Ok, sorry:

ipfw: install_state: Too many dynamic rules

Any idea what a safe number to use is? I've set
net.inet.ip.fw.dyn_max: 8192

regards
--
David Peall :: IT Manager
e-Schools' Network :: http://www.esn.org.za/
Phone +27 (021) 674-9140

From koitsu at FreeBSD.org Thu Nov 13 04:27:38 2008
From: koitsu at FreeBSD.org (Jeremy Chadwick)
Date: Thu Nov 13 04:27:45 2008
Subject: ipfw erratic on 7 stable
In-Reply-To:
References:
Message-ID: <20081113122736.GA21273@icarus.home.lan>

On Thu, Nov 13, 2008 at 01:24:10PM +0200, David Peall wrote:
> I'm having a problem with ipfw, I think.
>
> For some reason it denies packets randomly for example:
>
> PING 196.14.239.2 (196.14.239.2): 56 data bytes
> ping: sendto: Permission denied
> ping: sendto: Permission denied
> 64 bytes from 196.14.239.2: icmp_seq=2 ttl=63 time=0.258 ms
> 64 bytes from 196.14.239.2: icmp_seq=3 ttl=63 time=0.233 ms
> 64 bytes from 196.14.239.2: icmp_seq=4 ttl=63 time=0.211 ms
>
> Not sure what else would be helpful at this point.
>
> Running:
> FreeBSD 7.1-PRERELEASE #0: Fri Oct 31 09:44:07 UTC 2008

In my experience, "Permission denied" is returned if you have a rule
that blocks certain outbound packets; the OS tells the socket owner "no
can do".
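For the "Too many dynamic rules" message reported earlier in this
thread, one rough way to judge whether a given dyn_max is large enough
is to compare the current dynamic-rule count with the limit. The sketch
below is only an illustration: the helper name dyn_usage is made up, and
it assumes the two values are read from the net.inet.ip.fw.dyn_count and
net.inet.ip.fw.dyn_max sysctls on a host with ipfw loaded.

```shell
#!/bin/sh
# dyn_usage COUNT MAX (hypothetical helper):
# print how full the ipfw dynamic-rule table is, as an integer percentage.
dyn_usage() {
    count=$1
    max=$2
    if [ "$max" -le 0 ]; then
        echo "dyn_usage: max must be positive" >&2
        return 1
    fi
    echo $(( count * 100 / max ))
}

# Assumed usage on a FreeBSD host with ipfw loaded:
#   dyn_usage "$(sysctl -n net.inet.ip.fw.dyn_count)" \
#             "$(sysctl -n net.inet.ip.fw.dyn_max)"
```

If the percentage sits near 100 while packets are being refused, the
limit is probably the cause; whether raising it is the right fix depends
on the ruleset.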
There isn't enough information in the above report to help determine why it happens randomly; what flags have you passed ping? And please provide your entire ipfw ruleset, something may stand out. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From kostikbel at gmail.com Thu Nov 13 05:21:15 2008 From: kostikbel at gmail.com (Kostik Belousov) Date: Thu Nov 13 05:21:22 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113104514.GA17589@icarus.home.lan> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> <20081113102642.GQ47073@deviant.kiev.zoral.com.ua> <20081113104514.GA17589@icarus.home.lan> Message-ID: <20081113132109.GT47073@deviant.kiev.zoral.com.ua> On Thu, Nov 13, 2008 at 02:45:14AM -0800, Jeremy Chadwick wrote: > On Thu, Nov 13, 2008 at 12:26:42PM +0200, Kostik Belousov wrote: > > On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote: > > > On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote: > > > > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote: > > > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote: > > > > > > I've been playing around with snapshots lately but I've got a problem on > > > > > > one of my servers running 7-STABLE amd64: > > > > > > > > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64 > > > > > > > > > > > > I run the mksnap_ffs command to take the snapshot and some time later > > > > > > the system completely freezes up: > > > > > > > > > > > > paladin# cd /u2/.snap/ > > > > > > paladin# mksnap_ffs /u2 test.1 > > > > > > > > > > > > It only happens on this one filesystem, though, which 
might be to do > > > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's > > > > > > also backed by a hardware RAID system, although a smaller filesystem on > > > > > > the same RAID has no issues. > > > > > > > > > > > > Filesystem 1K-blocks Used Avail Capacity Mounted on > > > > > > /dev/da0s1a 2078881084 921821396 990749202 48% /u2 > > > > > > > > > > > > To clarify "completely freezes up": unresponsive to all services over > > > > > > the network, except ping. On the console I can switch between the ttys, > > > > > > but none of them respond. The only way out is to hit the reset button. > > > > > > > > > > You need to provide information described in the > > > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html > > > > > and especially > > > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html > > > > > > > > Ok, I've done that, and removed the patch that seemed to fix things. > > > > > > > > The first thing I notice after doing this on the console is that I can > > > > still ctrl+t the process: > > > > > > > > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k > > > > > > > > But the top and ps I left running on other ttys have all stopped > > > > responding. > > > > > > Then in my book, the patch didn't fix anything. :-) The system is > > > still "deadlocking"; snapshot generation **should not** wedge the system > > > hard like this. > > You systematically mix two completely different issues: > > - first one is the _deadlock_ experienced by Tim; > > Re-read what he wrote. Quote: > > "Ok, I've done that, and removed the patch that seemed to fix things. > > The first thing I notice after doing this on the console is that I can > still ctrl+t the process: > > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k > > But the top and ps I left running on other ttys have all stopped > responding." 
>
> If he can press Control-T, it means SIGINFO can be sent to the
> mksnap_ffs process, and the process responds with that information. So,
> the system is not deadlocked -- meaning, I believe what he experiences
> is what others experience (the system becomes completely unusable during
> mksnap_ffs running, but DOES NOT hang or lock up, it just becomes so
> god-awful slow that processes on the machine literally sit and spin for
> minutes at a time).

Unless NOKERNINFO is specified in the local flags of the controlling
terminal's termios, the kernel prints a one-line summary as shown above.
This is done from the tty discipline input handler (or whatever it is in
the new tty code). No process cooperation is required. On the other hand,
actually delivering SIGINFO and getting output from the process-installed
handler does require the process to be either executing in usermode or
sleeping interruptibly.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081113/5cf5489c/attachment.pgp

From avg at icyb.net.ua Thu Nov 13 05:38:55 2008
From: avg at icyb.net.ua (Andriy Gapon)
Date: Thu Nov 13 05:39:02 2008
Subject: find -L . -type l -delete
Message-ID: <491C2DEB.3010504@icyb.net.ua>

Am I stupid, or is our 'find' seriously broken in one subtle feature?

$ find -L . -type l
finds all broken symlinks (target doesn't exist)

$ find -L . -type l -delete
removes all symlinks!!!

FreeBSD 7.1-PRERELEASE amd64

--
Andriy Gapon

From eugen at kuzbass.ru Thu Nov 13 06:17:30 2008
From: eugen at kuzbass.ru (Eugene Grosbein)
Date: Thu Nov 13 06:17:37 2008
Subject: find -L .
-type l -delete In-Reply-To: <491C2DEB.3010504@icyb.net.ua> References: <491C2DEB.3010504@icyb.net.ua> Message-ID: <20081113141726.GA26583@svzserv.kemerovo.su> On Thu, Nov 13, 2008 at 03:38:51PM +0200, Andriy Gapon wrote: > Am I stupid or is our 'find' is seriously broken in one subtle feature? > > $ find -L . -type l > find all broken symlinks (target doesn't exists) > > $ find -L . -type l -delete > removes all symlinks!!! Yes. > FreeBSD 7.1-PRERELEASE amd64 This is pretty old and known problem: http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/90687 Eugene Grosbein From avg at icyb.net.ua Thu Nov 13 06:33:53 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Thu Nov 13 06:34:00 2008 Subject: find -L . -type l -delete In-Reply-To: <20081113141726.GA26583@svzserv.kemerovo.su> References: <491C2DEB.3010504@icyb.net.ua> <20081113141726.GA26583@svzserv.kemerovo.su> Message-ID: <491C3AB2.5010706@icyb.net.ua> on 13/11/2008 16:17 Eugene Grosbein said the following: > On Thu, Nov 13, 2008 at 03:38:51PM +0200, Andriy Gapon wrote: > >> Am I stupid or is our 'find' is seriously broken in one subtle feature? >> >> $ find -L . -type l >> find all broken symlinks (target doesn't exists) >> >> $ find -L . -type l -delete >> removes all symlinks!!! > > Yes. > >> FreeBSD 7.1-PRERELEASE amd64 > > This is pretty old and known problem: > http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/90687 Wow, that's pretty tough. I always relied on the following 3 step procedure: 1. find ... -print 2. verify output 3. find ... -delete [Profit!] But now I will have to re-tune myself to xargs. -- Andriy Gapon From eugen at kuzbass.ru Thu Nov 13 07:27:40 2008 From: eugen at kuzbass.ru (Eugene Grosbein) Date: Thu Nov 13 07:27:47 2008 Subject: find -L . 
-type l -delete In-Reply-To: <491C3AB2.5010706@icyb.net.ua> References: <491C2DEB.3010504@icyb.net.ua> <20081113141726.GA26583@svzserv.kemerovo.su> <491C3AB2.5010706@icyb.net.ua> Message-ID: <20081113152736.GA33024@svzserv.kemerovo.su> On Thu, Nov 13, 2008 at 04:33:22PM +0200, Andriy Gapon wrote: > >> $ find -L . -type l -delete > >> removes all symlinks!!! > >> FreeBSD 7.1-PRERELEASE amd64 > > This is pretty old and known problem: > > http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/90687 > Wow, that's pretty tough. > But now I will have to re-tune myself to xargs. Care to submit a followup to the PR with a note that it's still the problem for 7.1? Eugene Grosbein From freebsd at byshenk.net Thu Nov 13 08:41:38 2008 From: freebsd at byshenk.net (Greg Byshenk) Date: Thu Nov 13 08:41:45 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113044200.GA10419@icarus.home.lan> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> Message-ID: <20081113160810.GN907@core.byshenk.net> On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote: > The rest of the below information is good -- but I'm confused about > something: is there anyone out there who can use mksnap_ffs on a > filesystem (/usr is a good test source) and NOT experience this > deadlocking problem? Literally *every* FreeBSD box I have root access > to suffers from this problem, so I'm a little baffled why we end-users > need to keep providing debugging output when it should be easy as pie > for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch > their system wedge. As an answer to the question (and additional information), I am experiencing the problem, but not on all filesystems. This is under FreeBSD 7.1-PRERELEASE #7: Thu Nov 6 11:29:52 CET 2008, amd64 (from sources csup'ed immediately prior to the build). 
I have four filesystems used for data storage: /dev/da1p1 96850470 7866026 81236408 9% /export/mail /dev/da1p2 1937058312 972070320 810023328 55% /export/home /dev/da1p3 1937058312 79027008 1703066640 4% /export/misc /dev/da1p4 2598991534 271980564 2119091648 11% /export/spare I can successfully mksnap_ffs the first (smaller) partition, but an attempt to do so on any of the others causes a lock. Note: this is a lockup, not a "slow". The system becomes unresponsive to any input, and there is no hard drive activity, and this does not change over a period of more than 12 hours. -- greg byshenk - gbyshenk@byshenk.net - Leiden, NL From ambrisko at ambrisko.com Thu Nov 13 09:38:04 2008 From: ambrisko at ambrisko.com (Doug Ambrisko) Date: Thu Nov 13 09:38:10 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113132109.GT47073@deviant.kiev.zoral.com.ua> Message-ID: <200811131738.mADHc2Lh097312@ambrisko.com> Kostik Belousov writes: | On Thu, Nov 13, 2008 at 02:45:14AM -0800, Jeremy Chadwick wrote: [snip] | > If he can press Control-T, it means SIGINFO can be sent to the | > mksnap_ffs process, and the process responds with that information. So, | > the system is not deadlocked -- meaning, I believe what he experiences | > is what others experience (the system becomes completely unusable during | > mksnap_ffs running, but DOES NOT hang or lock up, it just becomes so | > god-awful slow that processes on the machine literally sit and spin for | > minutes at a time). | | Unless NOKERNINFO is specified in the local flags in the controlling | terminal termios, kernel prints one line summary as shown above. This is | done from the tty discipline input handler (or whatever it is in new tty | code). No process cooperation is required. On the other hand, actually | delivering SIGINFO and getting output from the process-installed | handler do require process to either executing usermode or sleeping | interruptible. 
Also note that a "deadlock" is not just a locking issue; it can also arise from other dependency chains. For example: buffer cache usage hits its maximum, so the buffer daemon needs to flush things out, but to flush it needs a buffer, and it can't get one until something has been flushed. Things get really bad when the buffer daemon needs a buffer but can't get one! In theory it can use "emergency space" set aside just for it to get out of this situation, but if the buffer cache is fragmented such that all available buffers are too small, then the buffer daemon is stuck on itself. Note that everything works except for anything that touches the buffer cache, such as a program coming off disk. A program already in memory is okay. To really get a good picture of this you need to look at the various buffer cache variables via ddb (i.e. hi, low, running, etc.). A while back I wrote a debugging function to dump the state of things every minute or so. There are various loops you can get into, so then you start playing whack-a-mole. Usually, because of the first bug, you can't hit the 2nd, 3rd and so on, adding to the fun. Unfortunately there isn't one magic bullet. These are not new problems, since we hit them in 4.X. I did start to go over some of this issue with Tor but ran into ENOTIME on my side :-( Snapshots can take a very long time to make, depending on the amount of data that has to be snapshotted, and during that time everything has to be effectively locked out of the file system or the snapshot will be wrong. This then leads to a need for a good journaling fs that can be used on "big" disks (big isn't that big anymore). Doug A. From reichcc at comcast.net Thu Nov 13 10:49:34 2008 From: reichcc at comcast.net (Patrick Reich) Date: Thu Nov 13 10:49:41 2008 Subject: System deadlock when using mksnap_ffs Message-ID: <1226601206.3091.37.camel@acheron> I'll just chime in briefly. I contacted Jeremy off the list about this issue a few days ago.
I have one spare box i386 sitting here that I can happily test patches against; if I can be of help, let me know. > uname -a FreeBSD localhost.localdomain 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Tue Nov 11 21:40:27 CST 2008 user@localhost.localdomain:/usr/obj/usr/src/sys/GENERIC i386 > ident /boot/kernel/kernel | grep sleepqueue $FreeBSD: src/sys/kern/subr_sleepqueue.c,v 1.39.2.5 2008/09/16 20:01:57 jhb Exp $ Suffers from the description given by Jeremy: the box is not deadlocked during snapshot but I might as well walk away from it because I can't use it. I'd really like to see this get fixed; I rely on dump for backups. Regards, Pat -- "Jesus, can't I count on you people!?" --Oh Brother, Where Art Thou, George Clooney From bms at incunabulum.net Thu Nov 13 13:37:31 2008 From: bms at incunabulum.net (Bruce Simpson) Date: Thu Nov 13 13:37:38 2008 Subject: kern.ipc.maxpipekva exceeded; see tuning(7) Message-ID: <491C9E11.7040703@incunabulum.net> I just got lots and lots of this: kern.ipc.maxpipekva exceeded; see tuning(7) However, tuning(7) on my system has no information about this tunable whatsoever. anglepoise:~ % uname -a FreeBSD anglepoise.lon.incunabulum.net 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #3: Tue Nov 4 15:40:44 GMT 2008 root@anglepoise.lon.incunabulum.net:/home/obj/usr/src/sys/ANGLEPOISE7 amd64 anglepoise:~ % sysctl kern.ipc.maxpipekva kern.ipc.maxpipekva: 20971520 I was running a couple of copies of "synergys" at the time. After killing them, all seems fine, however this was causing most binaries on the system to error out with ENOMEM. Any ideas? BMS From bms at incunabulum.net Thu Nov 13 13:56:03 2008 From: bms at incunabulum.net (Bruce Simpson) Date: Thu Nov 13 13:56:11 2008 Subject: MFC Request In-Reply-To: <49199F76.5010800@freebsdbrasil.com.br> References: <49199F76.5010800@freebsdbrasil.com.br> Message-ID: <491C9E56.2050004@incunabulum.net> Patrick Tracanelli wrote: > Is it possible to have traceroute MFC'd for 7.1? 
I would like to have > -a and -A switchs (ASN Path mapping) available. Thank you :) > There's an AS lookup capable traceroute in ports: /usr/ports/net/ntraceroute is this insufficient for your needs? thanks BMS From thompsa at FreeBSD.org Thu Nov 13 14:55:07 2008 From: thompsa at FreeBSD.org (Andrew Thompson) Date: Thu Nov 13 14:55:14 2008 Subject: MFC Request In-Reply-To: <491C9E56.2050004@incunabulum.net> References: <49199F76.5010800@freebsdbrasil.com.br> <491C9E56.2050004@incunabulum.net> Message-ID: <20081113223246.GA79815@citylink.fud.org.nz> On Thu, Nov 13, 2008 at 09:38:30PM +0000, Bruce Simpson wrote: > Patrick Tracanelli wrote: >> Is it possible to have traceroute MFC'd for 7.1? I would like to have -a >> and -A switchs (ASN Path mapping) available. Thank you :) >> > > There's an AS lookup capable traceroute in ports: > /usr/ports/net/ntraceroute > > is this insufficient for your needs? Also note that Rui has completed the traceroute MFC for 7.1 From hartzell at alerce.com Thu Nov 13 16:23:39 2008 From: hartzell at alerce.com (George Hartzell) Date: Thu Nov 13 16:23:46 2008 Subject: problem moving gmirror between two machines. Message-ID: <18716.48723.452606.66518@almost.alerce.com> I have an HP DL360 with a pair of 1TB seagate disks that's been running -STABLE with a ZFS root partition set up using the tools available here: http://yds.coolrat.org/zfsboot.shtml It's been working great. As part of trying to understand what's going on, I csup'ed to -RELENG earlier today and rebuilt/installed the kernel and world whilst running on the DL360, so everything should be current. I tried to move the disks into an HP DL320 G4 and it fails to boot because it can't find /dev/mirror/boot (which it wants to mount onto /strap and then parts get nullfs'ed onto /boot and /rescue). It gives me the opportunity to start a shell, and from that shell I can do a zfs mount -a and get all of the zfs filesystems mounted, but there's nothing in /dev/mirror. 
No gmirror status and list are silent. I can move the disks back into the older machine and they work fine. I've run fdisk -s ad4 and bsdlabel -A /dev/ad4s1a and diffed the output from the two machines and they're identical. I've booted with kern.geom.mirror.debug=2 and the DL320G4 tastes /dev/ad4s1a (along with everything else) but doesn't do anything with it. Any ideas? g. From kmacy at freebsd.org Thu Nov 13 18:49:19 2008 From: kmacy at freebsd.org (Kip Macy) Date: Thu Nov 13 18:49:32 2008 Subject: kern.ipc.maxpipekva exceeded; see tuning(7) In-Reply-To: <491C9E11.7040703@incunabulum.net> References: <491C9E11.7040703@incunabulum.net> Message-ID: <3c1674c90811131819s199c836av33e51ebe131c1dcd@mail.gmail.com> I don't know off hand how you could end up with that many pipes. Nonetheless, sys_pipe.c has a good explanation of what that does and how pipe sizing works. -Kip On Thu, Nov 13, 2008 at 9:37 PM, Bruce Simpson wrote: > I just got lots and lots of this: > kern.ipc.maxpipekva exceeded; see tuning(7) > > However, tuning(7) on my system has no information about this tunable > whatsoever. > > anglepoise:~ % uname -a > FreeBSD anglepoise.lon.incunabulum.net 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE > #3: Tue Nov 4 15:40:44 GMT 2008 > root@anglepoise.lon.incunabulum.net:/home/obj/usr/src/sys/ANGLEPOISE7 amd64 > anglepoise:~ % sysctl kern.ipc.maxpipekva > kern.ipc.maxpipekva: 20971520 > > I was running a couple of copies of "synergys" at the time. After killing > them, all seems fine, however this was causing most binaries on the system > to error out with ENOMEM. > > Any ideas? > BMS > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > -- If we desire respect for the law, we must first make the law respectable. - Louis D. 
Brandeis From delphij at delphij.net Thu Nov 13 18:49:21 2008 From: delphij at delphij.net (Xin LI) Date: Thu Nov 13 18:49:33 2008 Subject: ZFS crashes on heavy threaded environment Message-ID: <491CE71F.2020208@delphij.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, Pawel, We can still reproduce the ZFS crash (threading+heavy I/O load) on a fresh 7.1-STABLE build, in a few minutes: /usr/local/bin/iozone -M -e -+u -T -t 128 -S 4096 -L 64 -r 4k -s 30g -i 0 -i 1 -i 2 -i 8 -+p 70 -C I have included a backtrace output from my colleague who has his hands on the test environment. Should there is more information necessary please let us know and we wish to provide help on this. It looks like that the problem has been fixed with the new ZFS version in the last patchset against -CURRENT though. Cheers, - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (FreeBSD) iEYEARECAAYFAkkc5x8ACgkQi+vbBBjt66B4igCfQTz0FM2yFYFwyJY26dVmdXCq ZeIAoJWGeWaBPNH31ZOoAnbbnottGzKQ =tcMs -----END PGP SIGNATURE----- -------------- next part -------------- Script started on Fri Nov 14 10:36:33 2008 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... 
Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x18 fault code = supervisor read data, page not present instruction pointer = 0x8:0xffffffffb4839fb6 stack pointer = 0x10:0xffffffffb4beb8f0 frame pointer = 0x10:0xffffffffb4beb920 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 326 (txg_thread_enter) trap number = 12 panic: page fault cpuid = 0 Uptime: 8m15s Physical memory: 8175 MB Dumping 629 MB: 614 598 582 566 550 534 518 502 486 470 454 438 422 406 390 374 358 342 326 310 294 278 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6 Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done. done. Loaded symbols for /boot/kernel/opensolaris.ko #0 doadump () at pcpu.h:195 195 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump () at pcpu.h:195 #1 0xffffffffb4beb590 in ?? () #2 0xffffffff8043cc59 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418 #3 0xffffffff8043d062 in panic (fmt=0x104
) at /usr/src/sys/kern/kern_shutdown.c:574 #4 0xffffffff806d4c43 in trap_fatal (frame=0xffffff000481b370, eva=Variable "eva" is not available. ) at /usr/src/sys/amd64/amd64/trap.c:764 #5 0xffffffff806d5015 in trap_pfault (frame=0xffffffffb4beb840, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:680 #6 0xffffffff806d5958 in trap (frame=0xffffffffb4beb840) at /usr/src/sys/amd64/amd64/trap.c:449 #7 0xffffffff806bb14e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:209 #8 0xffffffffb4839fb6 in dmu_objset_sync_dnodes (list=0xffffff00047f46c0, tx=0xffffff0004f33280) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:707 #9 0xffffffffb483a11d in dmu_objset_sync (os=0xffffff00047f4600, pio=0xffffff0004367ac0, tx=0xffffff0004f33280) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:809 #10 0xffffffffb48476f2 in dsl_pool_sync (dp=0xffffff000479e400, txg=4864) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:188 #11 0xffffffffb4852020 in spa_sync (spa=0xffffff0004028000, txg=4864) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2989 #12 0xffffffffb4857eef in txg_sync_thread (arg=Variable "arg" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:331 #13 0xffffffff8041ae53 in fork_exit (callout=0xffffffffb4857dc0 , arg=0xffffff000479e400, frame=0xffffffffb4bebc80) at /usr/src/sys/kern/kern_fork.c:804 #14 0xffffffff806bb51e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:455 #15 0x0000000000000000 in ?? () #16 0x0000000000000000 in ?? () #17 0x0000000000000001 in ?? () #18 0x0000000000000000 in ?? () #19 0x0000000000000000 in ?? () #20 0x0000000000000000 in ?? () #21 0x0000000000000000 in ?? () #22 0x0000000000000000 in ?? () #23 0x0000000000000000 in ?? () #24 0x0000000000000000 in ?? () #25 0x0000000000000000 in ?? () #26 0x0000000000000000 in ?? 
() #27 0x0000000000000000 in ?? () #28 0x0000000000000000 in ?? () #29 0x0000000000000000 in ?? () #30 0x0000000000000000 in ?? () #31 0x0000000000000000 in ?? () #32 0x0000000000000000 in ?? () #33 0x0000000000000000 in ?? () #34 0x0000000000000000 in ?? () #35 0x0000000000000000 in ?? () #36 0x0000000000000000 in ?? () #37 0x0000000000000000 in ?? () #38 0x0000000000000000 in ?? () #39 0x0000000000c41000 in ?? () #40 0xffffffff809da880 in tdg_maxid () ---Type to continue, or q to quit--- #41 0xffffffff809e7080 in tdq_cpu () #42 0xffffffff809e7080 in tdq_cpu () #43 0xffffff000481b370 in ?? () #44 0xffffff000481b6a0 in ?? () #45 0xffffffffb4beb1c8 in ?? () #46 0x0000000000000000 in ?? () #47 0xffffffff8045d8b8 in sched_switch (td=0xffffffffb4857dc0, newtd=0x80055a5d0, flags=Variable "flags" is not available. ) at /usr/src/sys/kern/sched_ule.c:1938 #48 0x0000000000000000 in ?? () #49 0x0000000000000000 in ?? () #50 0x0000000000000000 in ?? () #51 0x0000000000000000 in ?? () #52 0x0000000000000000 in ?? () #53 0x0000000000000000 in ?? () #54 0x0000000000000000 in ?? () #55 0x0000000000000000 in ?? () #56 0x0000000000000000 in ?? () #57 0x0000000000000000 in ?? () #58 0x0000000000000000 in ?? () #59 0x0000000000000000 in ?? () #60 0x0000000000000000 in ?? () #61 0x0000000000000000 in ?? () #62 0x0000000000000000 in ?? () #63 0x0000000000000000 in ?? () #64 0x0000000000000000 in ?? () #65 0x0000000000000000 in ?? () #66 0x0000000000000000 in ?? () #67 0x0000000000000000 in ?? () #68 0x0000000000000000 in ?? () #69 0x0000000000000000 in ?? () #70 0x0000000000000000 in ?? () #71 0x0000000000000000 in ?? () #72 0x0000000000000000 in ?? () #73 0x0000000000000000 in ?? () #74 0x0000000000000000 in ?? () #75 0x0000000000000000 in ?? () #76 0x0000000000000000 in ?? () #77 0x0000000000000000 in ?? () #78 0x0000000000000000 in ?? () #79 0x0000000000000000 in ?? () #80 0x0000000000000000 in ?? () #81 0x0000000000000000 in ?? () #82 0x0000000000000000 in ?? 
() #83 0x0000000000000000 in ?? () ---Type to continue, or q to quit--- #84 0x0000000000000000 in ?? () #85 0x0000000000000000 in ?? () #86 0x0000000000000000 in ?? () #87 0x0000000000000000 in ?? () #88 0x0000000000000000 in ?? () #89 0x0000000000000000 in ?? () #90 0x0000000000000000 in ?? () #91 0x0000000000000000 in ?? () #92 0x0000000000000000 in ?? () #93 0x0000000000000000 in ?? () #94 0x0000000000000000 in ?? () #95 0x0000000000000000 in ?? () #96 0x0000000000000000 in ?? () #97 0x0000000000000000 in ?? () #98 0x0000000000000000 in ?? () #99 0x0000000000000000 in ?? () #100 0x0000000000000000 in ?? () #101 0x0000000000000000 in ?? () #102 0x0000000000000000 in ?? () #103 0x0000000000000000 in ?? () #104 0x0000000000000000 in ?? () #105 0x0000000000000000 in ?? () #106 0x0000000000000000 in ?? () #107 0x0000000000000000 in ?? () #108 0x0000000000000000 in ?? () #109 0x0000000000000000 in ?? () #110 0x0000000000000000 in ?? () #111 0x0000000000000000 in ?? () #112 0x0000000000000000 in ?? () #113 0x0000000000000000 in ?? () #114 0x0000000000000000 in ?? () #115 0x0000000000000000 in ?? () Cannot access memory at address 0xffffffffb4bec000 (kgdb) quit Script done on Fri Nov 14 10:36:43 2008 From delphij at delphij.net Thu Nov 13 18:53:52 2008 From: delphij at delphij.net (Xin LI) Date: Thu Nov 13 18:53:59 2008 Subject: ZFS crashes on heavy threaded environment In-Reply-To: <491CE71F.2020208@delphij.net> References: <491CE71F.2020208@delphij.net> Message-ID: <491CE835.4050504@delphij.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Xin LI wrote: > Hi, Pawel, > > We can still reproduce the ZFS crash (threading+heavy I/O load) on a > fresh 7.1-STABLE build, in a few minutes: > > /usr/local/bin/iozone -M -e -+u -T -t 128 -S 4096 -L 64 -r 4k -s 30g -i > 0 -i 1 -i 2 -i 8 -+p 70 -C > > I have included a backtrace output from my colleague who has his hands > on the test environment. 
Should there is more information necessary > please let us know and we wish to provide help on this. Further datapoint. The system used to run with untuned loader.conf, and my colleague just reported that with the following loader.conf, the problem can be triggered sooner: vm.kmem_size_max=838860800 vm.kmem_size_scale="2" The system is running FreeBSD/amd64 7.1-PRERELEASE equipped with 8GB of RAM with GENERIC kernel. Cheers, - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (FreeBSD) iEYEARECAAYFAkkc6DUACgkQi+vbBBjt66Cf7ACgjkeem1RLtDGCPF4MSvWY/1bz QvoAnjMIniOouszCIWl9JPLUn8KRuKlP =UchH -----END PGP SIGNATURE----- From hartzell at alerce.com Thu Nov 13 19:25:38 2008 From: hartzell at alerce.com (George Hartzell) Date: Thu Nov 13 19:25:45 2008 Subject: problem moving gmirror between two machines. In-Reply-To: <18716.48723.452606.66518@almost.alerce.com> References: <18716.48723.452606.66518@almost.alerce.com> Message-ID: <18716.61361.242320.333557@almost.alerce.com> George Hartzell writes: > [...] > It's been working great. As part of trying to understand what's going > on, I csup'ed to -RELENG earlier today and rebuilt/installed the > kernel and world whilst running on the DL360, so everything should be > current. > [...] Just to be clear, I mean that I have an up to date version of -STABLE on the machine (it claims to be 7.1-PRERELEASE), not that I'm running -CURRENT. g. From axel at dnepr.net Thu Nov 13 23:39:22 2008 From: axel at dnepr.net (Oleg Kozheltsev) Date: Thu Nov 13 23:39:31 2008 Subject: can't write dvd (interrupt storm) Message-ID: <491D28BD.4050309@dnepr.net> Hello, yesterday I bought a new INTEL DP43TF motherboard, and now I can't write DVDs in any format (on a PATA DVD drive), but I can still read any DVD or CD. Writing CDs and blanking DVD-RWs also work fine.
When I start writing a DVD with growisofs (/dev/cd0 - atapicam), I get this in the log: Nov 14 08:28:06 gx kernel: acd0: FAILURE - RESERVE_TRACK ILLEGAL REQUEST asc=0x30 ascq=0x05 Nov 14 08:28:06 gx kernel: interrupt storm detected on "irq4:"; throttling interrupt source Nov 14 08:28:50 gx last message repeated times Nov 14 08:31:07 gx kernel: acd0: FAILURE - WRITE_BIG timed out If I use burncd (/dev/acd0, with or without atapicam in the kernel), then I get only the second line in the log. Before the upgrade I had 6.3-RELEASE-p3; I updated to 7.0-RELEASE-p5 and after that to 7.1-PRERELEASE #5, and it doesn't change anything. The drive configuration in the BIOS makes no difference either (only the SATA controller setting, ICH10/AHCI, was changed). This IRQ is also used by USB and COM. Turning off both in the BIOS and/or the kernel changes the IRQ map, but the drive still does not work correctly. Whether the ACPI module is present or not doesn't matter either. By the way, I also can't reach the COM port (/dev/cuad0), though without any interrupt storm in the logs (whether attached through ACPI or ISA, and even if USB and atapicd are turned off). Has anyone had the same problem?
thanks for attention :) system info: FreeBSD gx.dnepr.net 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #3: Thu Nov 13 16:23:09 EET 2008 root@gx.dnepr.net:/usr/obj/usr/src/sys/KERN i386 pcib3: irq 4 at device 28.3 on pci0 pci3: on pcib3 atapci0: port 0xd040-0xd047,0xd030-0xd033,0xd020-0xd027,0xd010-0xd013,0xd000-0xd00f irq 4 at device 0.0 on pci3 atapci0: [ITHREAD] ata2: on atapci0 ata2: [ITHREAD] usb4: on uhci3 usb4: USB revision 1.0 uhub4: on usb4 uhub4: 2 ports with 2 removable, self powered uhci4: port 0xf060-0xf07f irq 4 at device 29.1 on pci0 uhci4: [GIANT-LOCKED] uhci4: [ITHREAD] sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 sio0: type 16550A acd0: DVDR at ata2- UDMA33 cd0 at ata2 bus 0 target 1 lun 0 cd0: <_NEC DVD_RW ND-3550A 1.05> Removable CD-ROM SCSI-0 device cd0: 33.000MB/s transfers cd0: Attempt to query device size failed: NOT READY, Medium not present (if media present in boot time - interrupt storm again) From freebsd at byshenk.net Fri Nov 14 01:26:11 2008 From: freebsd at byshenk.net (Greg Byshenk) Date: Fri Nov 14 01:26:17 2008 Subject: System deadlock when using mksnap_ffs In-Reply-To: <20081113160810.GN907@core.byshenk.net> References: <20081112175826.GD26195@carrick.bishnet.net> <20081112194735.GK47073@deviant.kiev.zoral.com.ua> <20081113004102.GD24360@carrick.bishnet.net> <20081113044200.GA10419@icarus.home.lan> <20081113160810.GN907@core.byshenk.net> Message-ID: <20081114092608.GP907@core.byshenk.net> On Thu, Nov 13, 2008 at 05:08:10PM +0100, Greg Byshenk wrote: > On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote: > > > The rest of the below information is good -- but I'm confused about > > something: is there anyone out there who can use mksnap_ffs on a > > filesystem (/usr is a good test source) and NOT experience this > > deadlocking problem? 
Literally *every* FreeBSD box I have root access > > to suffers from this problem, so I'm a little baffled why we end-users > > need to keep providing debugging output when it should be easy as pie > > for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch > > their system wedge. > > As an answer to the question (and additional information), I am > experiencing the problem, but not on all filesystems. > > This is under FreeBSD 7.1-PRERELEASE #7: Thu Nov 6 11:29:52 CET 2008, > amd64 (from sources csup'ed immediately prior to the build). > > I have four filesystems used for data storage: > > /dev/da1p1 96850470 7866026 81236408 9% /export/mail > /dev/da1p2 1937058312 972070320 810023328 55% /export/home > /dev/da1p3 1937058312 79027008 1703066640 4% /export/misc > /dev/da1p4 2598991534 271980564 2119091648 11% /export/spare > > I can successfully mksnap_ffs the first (smaller) partition, but an > attempt to do so on any of the others causes a lock. > > Note: this is a lockup, not a "slow". The system becomes unresponsive > to any input, and there is no hard drive activity, and this does not > change over a period of more than 12 hours. As a followup to my own post, after reading this discussion, I applied the patches and rebuilt my system last night. As of today, with the patched ffs_snapshot.c, I can now make snapshots of all the filesystems listed above. It takes rather a long time, but that is to be expected, I think, and the snapshots finish normally.
-- greg byshenk - gbyshenk@byshenk.net - Leiden, NL From ivoras at freebsd.org Fri Nov 14 01:52:32 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Fri Nov 14 01:52:40 2008 Subject: ZFS crashes on heavy threaded environment In-Reply-To: <491CE835.4050504@delphij.net> References: <491CE71F.2020208@delphij.net> <491CE835.4050504@delphij.net> Message-ID: Xin LI wrote: > Xin LI wrote: >> Hi, Pawel, > >> We can still reproduce the ZFS crash (threading+heavy I/O load) on a >> fresh 7.1-STABLE build, in a few minutes: > >> /usr/local/bin/iozone -M -e -+u -T -t 128 -S 4096 -L 64 -r 4k -s 30g -i >> 0 -i 1 -i 2 -i 8 -+p 70 -C Yes, this is known. >> I have included a backtrace output from my colleague who has his hands >> on the test environment. Should there is more information necessary >> please let us know and we wish to provide help on this. > > Further datapoint. The system used to run with untuned loader.conf, and > my colleague just reported that with the following loader.conf, the > problem can be triggered sooner: > > vm.kmem_size_max=838860800 > vm.kmem_size_scale="2" These two settings only serve to calculate vm.kmem_size, so you could simply skip them and adjust vm.kmem_size directly. > The system is running FreeBSD/amd64 7.1-PRERELEASE equipped with 8GB of > RAM with GENERIC kernel. You can tune vm.kmem_size to near 2 GB on your machine and OS version. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081114/ae891896/signature.pgp From dougb at FreeBSD.org Fri Nov 14 03:03:50 2008 From: dougb at FreeBSD.org (Doug Barton) Date: Fri Nov 14 03:03:59 2008 Subject: RELENG_7 build failure in rescue/ (iconv?) 
In-Reply-To: <491B5BF3.1010309@FreeBSD.org> References: <491B5BF3.1010309@FreeBSD.org> Message-ID: <491D5B13.2080800@FreeBSD.org> Ignore this, turned out that PEBCAK. Doug -- This .signature sanitized for your protection From sclark46 at earthlink.net Fri Nov 14 04:20:02 2008 From: sclark46 at earthlink.net (Stephen Clark) Date: Fri Nov 14 04:20:15 2008 Subject: FreeBSD 6.3 gre and traceroute In-Reply-To: <491C4EC2.2000802@earthlink.net> References: <491B2703.4080707@earthlink.net> <491B31F7.30200@elischer.org> <491B4345.80106@earthlink.net> <491B47D2.6010804@elischer.org> <491C2235.4090509@earthlink.net> <1226589468.1976.12.camel@wombat.2hip.net> <491C4EC2.2000802@earthlink.net> Message-ID: <491D6CED.50006@earthlink.net> Stephen Clark wrote: > Robert Noland wrote: >> On Thu, 2008-11-13 at 07:48 -0500, Stephen Clark wrote: >>> Julian Elischer wrote: >>>> Stephen Clark wrote: >>>>> Julian Elischer wrote: >>>>>> you will need to define the setup and question better. >>>> thanks.. cleaning it up a bit more... >>>> >>>> 10.0.129.1 FreeBSD workstation >>>> ^ >>>> | >>>> | ethernet >>>> | >>>> v >>>> 10.0.128.1 Freebsd FW "A" >>>> ^ >>>> | >>>> | gre / ipsec >>>> | >>>> v >>>> 192.168.3.1 FreeBSD FW "B" >>>> ^ >>>> | >>>> | ethernet >>>> | >>>> v >>>> 192.168.3.86 linux workstation >>>> >>>>> $ sudo traceroute 192.168.3.86 >>>>> traceroute to 192.168.3.86 (192.168.3.86), 64 hops max, 40 byte >>>>> packets >>>>> 1 HQFirewallRS.com (10.0.128.1) 0.575 ms 0.423 ms 0.173 ms >>>>> 2 * * * >>>>> 3 192.168.3.86 (192.168.3.86) 47.972 ms 45.174 ms 49.968 ms >>>>> >>>>> No response from the FreeBSD "B" box. >>>>> >>>>> When I do a tcpdump on "B" of the gre interface I see UDP packets >>>>> with a TTL of 1 but no ICMP response packets being sent back. >>>>> If I do the traceroute from the linux workstation 192.168.3.86 I get >>>>> similar results - I don't see a response from the FreeBSD "A" box. >>>> could you try using just GRE encasulation? >>>> (i.e. 
turn off IPSEC for now) >>>> >>>> I think that is much more likely to be where the problem is.. >>>> >>>> >>> I'll have to set this up to test it. >> >> The ttl exceeded is triggered from one of two places. Either >> netinet/ip_fastfwd.c if fast_forwarding is enabled or in >> netinet/ip_input.c. Look for the code relating to IPTTLDEC. This isn't >> your problem though... If ttl were not being decremented, the packet >> would just be forwarded on to the next hop (IP_STEALTH), which would >> just make the firewalls invisible. The fact that you are seeing * * * >> indicates that you are not receiving the ttl exceeded message for the >> packet sent with that particular ttl. I still think that the issue you >> are seeing is that one way or another the generated ICMP response isn't >> making it back onto the tunnel. Either via security policy, firewall or >> routing. > Your right, when I do a tcpdump on the gre interface I see the udp > packet come > in with a ttl=1 but I don't see a response icmp packet. I have tested > this with > all the firewalls disabled to make sure the icmp packet was not being > blocked. > I just ran another test and did tcpdump on all the other interfaces to > make sure > the icmp's were not being misrouted, it seems they are not being > generated for some reason. Also just using gre's without the underlying > ipsec tunnels seems to > work properly. >> >> robert. >> >>> What code in the FreeBSD kernel is responsible for generating the >>> response ICMP dest unreachable message? >>> > > Another data point I had been using option FILTER_GIF I tried a kernel without that option and it behaved the same. Steve -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." 
(Thomas Jefferson) From sclark46 at earthlink.net Fri Nov 14 06:31:29 2008 From: sclark46 at earthlink.net (Stephen Clark) Date: Fri Nov 14 06:31:35 2008 Subject: FreeBSD 6.3 ipsec and traceroute doesn't work as good as Linux -why? In-Reply-To: <491D6CED.50006@earthlink.net> References: <491B2703.4080707@earthlink.net> <491B31F7.30200@elischer.org> <491B4345.80106@earthlink.net> <491B47D2.6010804@elischer.org> <491C2235.4090509@earthlink.net> <1226589468.1976.12.camel@wombat.2hip.net> <491C4EC2.2000802@earthlink.net> <491D6CED.50006@earthlink.net> Message-ID: <491D8BBC.8090201@earthlink.net> 10.0.129.1 FreeBSD workstation ^ | | ethernet | v 10.0.128.1 Freebsd FW "A" ^ | | ipsec | v 192.168.2.1 Linux FW "B" ^ | | ethernet | v 192.168.2.20 linux workstation from 192.168.2.20 Linux<->ipsec<->FreeBSD traceroute -I 10.0.129.1 traceroute to 10.0.129.1 (10.0.129.1), 30 hops max, 60 byte packets 1 192.168.2.1 (192.168.2.1) 0.434 ms 0.425 ms 0.423 ms 2 * * * 3 sclark (10.0.129.1) 42.418 ms 42.419 ms 42.727 ms traceroute -I 10.0.128.1 traceroute to 10.0.128.1 (10.0.128.1), 30 hops max, 60 byte packets 1 192.168.2.1 (192.168.2.1) 0.398 ms 0.504 ms 0.505 ms 2 10.0.128.1 (10.0.128.1) 36.066 ms 36.052 ms 37.800 ms traceroute 10.0.129.1 traceroute to 10.0.129.1 (10.0.129.1), 30 hops max, 60 byte packets 1 192.168.2.1 (192.168.2.1) 0.484 ms 0.464 ms 0.447 ms 2 * * * 3 sclark (10.0.129.1) 41.406 ms 41.391 ms 47.812 ms traceroute 10.0.128.1 traceroute to 10.0.128.1 (10.0.128.1), 30 hops max, 60 byte packets 1 (192.168.2.1) 0.473 ms 0.444 ms 0.427 ms 2 * * * 3 * * * 4 * * * 5 * * * 6 * * * 7 * * * 8 * * * 9 * * * 10 * * * 11 * * * 12 * *^C from 10.0.129.1 FreeBSD<->ipsec<->Linux sudo traceroute 192.168.2.20 traceroute to 192.168.2.20 (192.168.2.20), 64 hops max, 40 byte packets 1 HQFirewallRS.com (10.0.128.1) 0.761 ms 2.551 ms 4.017 ms 2 * * * 3 192.168.2.20 (192.168.2.20) 19.956 ms 27.425 ms 27.487 ms sclark:~ $ sudo traceroute 192.168.2.1 traceroute to 192.168.2.1 (192.168.2.1), 64 
hops max, 40 byte packets 1 HQFirewallRS.com (10.0.128.1) 8.069 ms 2.952 ms 4.050 ms 2 home (192.168.2.1) 26.338 ms 22.132 ms 24.233 ms sclark:~ $ sudo traceroute -I 192.168.2.20 traceroute to 192.168.2.20 (192.168.2.20), 64 hops max, 60 byte packets 1 HQFirewallRS.com (10.0.128.1) 0.714 ms 0.806 ms 0.221 ms 2 home (192.168.2.1) 25.260 ms 25.312 ms 25.868 ms 3 192.168.2.20 (192.168.2.20) 36.477 ms 24.828 ms 24.903 ms sclark:~ $ sudo traceroute -I 192.168.2.1 traceroute to 192.168.2.1 (192.168.2.1), 64 hops max, 60 byte packets 1 HQFirewallRS.com (10.0.128.1) 2.219 ms 1.889 ms 4.491 ms 2 home (192.168.2.1) 26.172 ms 25.706 ms 24.981 ms tracerouteing to Linux never just gives a * * *, * * *, * * *, etc -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) From amon at aelita.org Fri Nov 14 07:14:50 2008 From: amon at aelita.org (Herve Boulouis) Date: Fri Nov 14 07:14:56 2008 Subject: Multiple panics with 7.1-PRERELEASE amd64/i386 and varnish In-Reply-To: <20081106150548.GF596@ra.aabs> References: <20081106102931.GD596@ra.aabs> <20081106150548.GF596@ra.aabs> Message-ID: <20081114161328.GO596@ra.aabs> Le 06/11/2008 16:05, Herve Boulouis a écrit: > Le 06/11/2008 11:29, Herve Boulouis a écrit: > > I just tried to reboot one of the boxes without kern.ipc.maxpipekva=104857600 to check for kva problems > but crashes persists, though the stack is completely different now. This time I included all the corrupt > parts of the stack that I had stripped in my original email but they are similar (from frame 18 to end). > > Any ideas ? We just found the same kind of crash with a 7.0-STABLE i386 from August so there is a serious bug in the kernel making varnish with file backend totaly unusable on FreeBSD 7. 
Backtrace : Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x52 fault code = supervisor read, page not present instruction pointer = 0x20:0xc093b90a stack pointer = 0x28:0xe4475ad0 frame pointer = 0x28:0xe4475ad0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 34 (syncer) trap number = 12 panic: page fault cpuid = 0 Uptime: 28m52s Physical memory: 1011 MB Dumping 148 MB: 133 117 101 85 69 53 37 21 5 Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done. done. Loaded symbols for /boot/kernel/acpi.ko #0 doadump () at pcpu.h:195 195 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump () at pcpu.h:195 #1 0xc071b3a6 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418 #2 0xc071b67e in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:572 #3 0xc09ec2dc in trap_fatal (frame=0xe4475a90, eva=82) at /usr/src/sys/i386/i386/trap.c:899 #4 0xc09ec54b in trap_pfault (frame=0xe4475a90, usermode=0, eva=82) at /usr/src/sys/i386/i386/trap.c:812 #5 0xc09ecf32 in trap (frame=0xe4475a90) at /usr/src/sys/i386/i386/trap.c:490 #6 0xc09d31cb in calltrap () at /usr/src/sys/i386/i386/exception.s:139 #7 0xc093b90a in vm_object_pip_add (object=0x0, i=1) at /usr/src/sys/vm/vm_object.c:273 #8 0xc078c207 in cluster_wbuild (vp=0xc4616564, size=16384, start_lbn=3, len=3) at /usr/src/sys/kern/vfs_cluster.c:925 #9 0xc07829a6 in vfs_bio_awrite (bp=0xd7fde3bc) at /usr/src/sys/kern/vfs_bio.c:1668 #10 0xc091593e in ffs_syncvnode (vp=0xc4616564, waitfor=3) at /usr/src/sys/ufs/ffs/ffs_vnops.c:283 #11 0xc0910e8d in ffs_sync (mp=0xc4209b30, waitfor=3, td=0xc4044660) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1234 #12 0xc079d4ef in sync_fsync (ap=0xe4475cd4) at /usr/src/sys/kern/vfs_subr.c:3217 #13 0xc0a01392 in VOP_FSYNC_APV (vop=0xc0affa60, a=0xe4475cd4) at 
vnode_if.c:1007
#14 0xc079dcd5 in sched_sync () at vnode_if.h:538
#15 0xc06f77f4 in fork_exit (callout=0xc079d5d0 , arg=0x0, frame=0xe4475d38) at /usr/src/sys/kern/kern_fork.c:781
#16 0xc09d3240 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:205
(kgdb)

I kept the vmcores (i386 and amd64) if someone needs them.

Regards,
--
Herve Boulouis

From hk at alogis.com Fri Nov 14 08:49:44 2008
From: hk at alogis.com (Holger Kipp)
Date: Fri Nov 14 08:49:53 2008
Subject: FreeBSD 6.3 ipsec and traceroute doesn't work as good as Linux -why?
In-Reply-To: <491D8BBC.8090201@earthlink.net>
References: <491B2703.4080707@earthlink.net> <491B31F7.30200@elischer.org> <491B4345.80106@earthlink.net> <491B47D2.6010804@elischer.org> <491C2235.4090509@earthlink.net> <1226589468.1976.12.camel@wombat.2hip.net> <491C4EC2.2000802@earthlink.net> <491D6CED.50006@earthlink.net> <491D8BBC.8090201@earthlink.net>
Message-ID: <20081114163618.GA10409@intserv.int1.b.intern>

On Fri, Nov 14, 2008 at 09:31:24AM -0500, Stephen Clark wrote:

Dear Stephen,

I don't want to be rude, but looking at your description I don't see what's wrong with the behaviour; it seems you don't understand what '* * *' really means.

How does traceroute work? Well, it sends out a packet with time to live (TTL) set to one. Each hop that the packet passes through decrements the TTL, and when the TTL reaches zero, a time exceeded message is sent back. Then another packet is sent with the TTL increased by one to identify the next hop, and so on.

If no answer is received, traceroute prints a '*' and tries again (up to three tries by default).

This process stops when the last hop replies. It does not stop (or only after e.g. 30 hops) if the last hop does not reply.

Why is it that we sometimes do not get a reply?
Possible answers: - fw-rules block these traceroute packages - routing for the answer packet is not set correctly - with IP-tunnel, the packet is not routed through the tunnel because it does not enter the ruleset from an external interface. This might be true for your firewalls. - ... So routing and fw-settings are very important here. You might want to check that first, before complaining ;-) In your setup you have not given both external and internal FW addresses. You might not want to have the FW be exposed on its internal interface to the remote network, instead you might want to have a transparent tunnel. Regards, Holger > 10.0.129.1 FreeBSD workstation > ^ > | > | ethernet > | > v > 10.0.128.1 Freebsd FW "A" > ^ > | > | ipsec > | > v > 192.168.2.1 Linux FW "B" > ^ > | > | ethernet > | > v > 192.168.2.20 linux workstation > > from 192.168.2.20 Linux<->ipsec<->FreeBSD > > traceroute -I 10.0.129.1 > traceroute to 10.0.129.1 (10.0.129.1), 30 hops max, 60 byte packets > 1 192.168.2.1 (192.168.2.1) 0.434 ms 0.425 ms 0.423 ms > 2 * * * > 3 sclark (10.0.129.1) 42.418 ms 42.419 ms 42.727 ms > > traceroute -I 10.0.128.1 > traceroute to 10.0.128.1 (10.0.128.1), 30 hops max, 60 byte packets > 1 192.168.2.1 (192.168.2.1) 0.398 ms 0.504 ms 0.505 ms > 2 10.0.128.1 (10.0.128.1) 36.066 ms 36.052 ms 37.800 ms > > traceroute 10.0.129.1 > traceroute to 10.0.129.1 (10.0.129.1), 30 hops max, 60 byte packets > 1 192.168.2.1 (192.168.2.1) 0.484 ms 0.464 ms 0.447 ms > 2 * * * > 3 sclark (10.0.129.1) 41.406 ms 41.391 ms 47.812 ms > > traceroute 10.0.128.1 > traceroute to 10.0.128.1 (10.0.128.1), 30 hops max, 60 byte packets > 1 (192.168.2.1) 0.473 ms 0.444 ms 0.427 ms > 2 * * * > 3 * * * > 4 * * * > 5 * * * > 6 * * * > 7 * * * > 8 * * * > 9 * * * > 10 * * * > 11 * * * > 12 * *^C > > > > from 10.0.129.1 FreeBSD<->ipsec<->Linux > sudo traceroute 192.168.2.20 > traceroute to 192.168.2.20 (192.168.2.20), 64 hops max, 40 byte packets > 1 HQFirewallRS.com (10.0.128.1) 0.761 ms 2.551 ms 
4.017 ms > 2 * * * > 3 192.168.2.20 (192.168.2.20) 19.956 ms 27.425 ms 27.487 ms > > sclark:~ > $ sudo traceroute 192.168.2.1 > traceroute to 192.168.2.1 (192.168.2.1), 64 hops max, 40 byte packets > 1 HQFirewallRS.com (10.0.128.1) 8.069 ms 2.952 ms 4.050 ms > 2 home (192.168.2.1) 26.338 ms 22.132 ms 24.233 ms > > sclark:~ > $ sudo traceroute -I 192.168.2.20 > traceroute to 192.168.2.20 (192.168.2.20), 64 hops max, 60 byte packets > 1 HQFirewallRS.com (10.0.128.1) 0.714 ms 0.806 ms 0.221 ms > 2 home (192.168.2.1) 25.260 ms 25.312 ms 25.868 ms > 3 192.168.2.20 (192.168.2.20) 36.477 ms 24.828 ms 24.903 ms > > sclark:~ > $ sudo traceroute -I 192.168.2.1 > traceroute to 192.168.2.1 (192.168.2.1), 64 hops max, 60 byte packets > 1 HQFirewallRS.com (10.0.128.1) 2.219 ms 1.889 ms 4.491 ms > 2 home (192.168.2.1) 26.172 ms 25.706 ms 24.981 ms > > tracerouteing to Linux never just gives a * * *, * * *, * * *, etc > > -- > > "They that give up essential liberty to obtain temporary safety, > deserve neither liberty nor safety." (Ben Franklin) > > "The course of history shows that as a government grows, liberty > decreases." (Thomas Jefferson) > > > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" From sclark46 at earthlink.net Fri Nov 14 10:30:12 2008 From: sclark46 at earthlink.net (Stephen Clark) Date: Fri Nov 14 10:30:19 2008 Subject: FreeBSD 6.3 ipsec and traceroute doesn't work as good as Linux -why? 
In-Reply-To: <20081114163618.GA10409@intserv.int1.b.intern> References: <491B2703.4080707@earthlink.net> <491B31F7.30200@elischer.org> <491B4345.80106@earthlink.net> <491B47D2.6010804@elischer.org> <491C2235.4090509@earthlink.net> <1226589468.1976.12.camel@wombat.2hip.net> <491C4EC2.2000802@earthlink.net> <491D6CED.50006@earthlink.net> <491D8BBC.8090201@earthlink.net> <20081114163618.GA10409@intserv.int1.b.intern> Message-ID: <491DC3B1.10308@earthlink.net> Holger Kipp wrote: > On Fri, Nov 14, 2008 at 09:31:24AM -0500, Stephen Clark wrote: > > Dear Stephen, > > I don't want to be rude, but looking at your description I don't see > what's wrong with the behaviour, but it seems you don't understand what > '* * *' really means. > > How does traceroute work? Well, it sends out a packet with time to live > (TTL) set to one. on the first hop, this will be reduced by each hop that > it passes through, and if TTL reaches zero, a time exceeded message will > be send back. Then another packet is send with TTL increased by one to > identify the next hop and so on. > > If no answer is received, print out a '*' and try again (up to three tries > by default). > > This process will stop if the last hop replies. It does not stop (or only > after eg. 30 hops) if the last hop does not reply. > > Why is it that we sometimes do not get a reply? Possible answers: > - fw-rules block these traceroute packages > - routing for the answer packet is not set correctly > - with IP-tunnel, the packet is not routed through the tunnel because > it does not enter the ruleset from an external interface. This might > be true for your firewalls. > - ... > > So routing and fw-settings are very important here. You might want to > check that first, before complaining ;-) > > In your setup you have not given both external and internal FW addresses. > You might not want to have the FW be exposed on its internal interface > to the remote network, instead you might want to have a transparent tunnel. 
> > Regards, > Holger > > >> 10.0.129.1 FreeBSD workstation >> ^ >> | >> | ethernet >> | >> v >> internal 10.0.128.1 Freebsd FW "A" public ip address >> ^ >> | >> | ipsec >> | >> v public ip address internal 192.168.2.1 Linux FW "B" >> ^ >> | >> | ethernet >> | >> v >> 192.168.2.20 linux workstation >> >> from 192.168.2.20 Linux<->ipsec<->FreeBSD >> >> traceroute -I 10.0.129.1 >> traceroute to 10.0.129.1 (10.0.129.1), 30 hops max, 60 byte packets >> 1 192.168.2.1 (192.168.2.1) 0.434 ms 0.425 ms 0.423 ms >> 2 * * * >> 3 sclark (10.0.129.1) 42.418 ms 42.419 ms 42.727 ms >> >> traceroute -I 10.0.128.1 >> traceroute to 10.0.128.1 (10.0.128.1), 30 hops max, 60 byte packets >> 1 192.168.2.1 (192.168.2.1) 0.398 ms 0.504 ms 0.505 ms >> 2 10.0.128.1 (10.0.128.1) 36.066 ms 36.052 ms 37.800 ms >> >> traceroute 10.0.129.1 >> traceroute to 10.0.129.1 (10.0.129.1), 30 hops max, 60 byte packets >> 1 192.168.2.1 (192.168.2.1) 0.484 ms 0.464 ms 0.447 ms >> 2 * * * >> 3 sclark (10.0.129.1) 41.406 ms 41.391 ms 47.812 ms >> >> traceroute 10.0.128.1 >> traceroute to 10.0.128.1 (10.0.128.1), 30 hops max, 60 byte packets >> 1 (192.168.2.1) 0.473 ms 0.444 ms 0.427 ms >> 2 * * * >> 3 * * * >> 4 * * * >> 5 * * * >> 6 * * * >> 7 * * * >> 8 * * * >> 9 * * * >> 10 * * * >> 11 * * * >> 12 * *^C >> >> >> >> from 10.0.129.1 FreeBSD<->ipsec<->Linux >> sudo traceroute 192.168.2.20 >> traceroute to 192.168.2.20 (192.168.2.20), 64 hops max, 40 byte packets >> 1 HQFirewallRS.com (10.0.128.1) 0.761 ms 2.551 ms 4.017 ms >> 2 * * * >> 3 192.168.2.20 (192.168.2.20) 19.956 ms 27.425 ms 27.487 ms >> >> sclark:~ >> $ sudo traceroute 192.168.2.1 >> traceroute to 192.168.2.1 (192.168.2.1), 64 hops max, 40 byte packets >> 1 HQFirewallRS.com (10.0.128.1) 8.069 ms 2.952 ms 4.050 ms >> 2 home (192.168.2.1) 26.338 ms 22.132 ms 24.233 ms >> >> sclark:~ >> $ sudo traceroute -I 192.168.2.20 >> traceroute to 192.168.2.20 (192.168.2.20), 64 hops max, 60 byte packets >> 1 HQFirewallRS.com (10.0.128.1) 0.714 ms 0.806 ms 
0.221 ms >> 2 home (192.168.2.1) 25.260 ms 25.312 ms 25.868 ms >> 3 192.168.2.20 (192.168.2.20) 36.477 ms 24.828 ms 24.903 ms >> >> sclark:~ >> $ sudo traceroute -I 192.168.2.1 >> traceroute to 192.168.2.1 (192.168.2.1), 64 hops max, 60 byte packets >> 1 HQFirewallRS.com (10.0.128.1) 2.219 ms 1.889 ms 4.491 ms >> 2 home (192.168.2.1) 26.172 ms 25.706 ms 24.981 ms >> >> tracerouteing to Linux never just gives a * * *, * * *, * * *, etc >> >> -- >> >> "They that give up essential liberty to obtain temporary safety, >> deserve neither liberty nor safety." (Ben Franklin) >> >> "The course of history shows that as a government grows, liberty >> decreases." (Thomas Jefferson) >> >> >> >> _______________________________________________ >> freebsd-stable@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-stable >> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > Hi Holger, Thanks for the reply. During my test I had the firewalls on all system disabled, The problem is the FreeBSD FW does not respond correctly even if I use the -I option on traceroute which uses ICMP packets instead of UDP packets. And I agree it looks to be some kind of routing problem - I put a diag in the freebsd kernel ip_input.c if (ip->ip_ttl <= IPTTLDEC) { icmp_error(m, ICMP_TIMXCEED, ICMP_TIMXCEED_INTRANS, 0, 0); return; to make sure it was calling icmp_error - it was. I have complementary setups on both the FreeBSD and Linux sides. It just seems that Linux handles things better than FreeBSD. 
EG when tracerouting from Linux to internal address on FreeBSD FW: >> traceroute 10.0.128.1 >> traceroute to 10.0.128.1 (10.0.128.1), 30 hops max, 60 byte packets >> 1 (192.168.2.1) 0.473 ms 0.444 ms 0.427 ms >> 2 * * * >> 3 * * * >> 4 * * * >> 5 * * * >> 6 * * * >> 7 * * * >> 8 * * * >> 9 * * * >> 10 * * * >> 11 * * * >> 12 * *^C But when tracerouting from FreeBSD to internal address on Linux FW. sudo traceroute 192.168.2.1 >> traceroute to 192.168.2.1 (192.168.2.1), 64 hops max, 40 byte packets >> 1 HQFirewallRS.com (10.0.128.1) 8.069 ms 2.952 ms 4.050 ms >> 2 home (192.168.2.1) 26.338 ms 22.132 ms 24.233 ms Much more meaningful results! From sclark46 at earthlink.net Fri Nov 14 10:37:03 2008 From: sclark46 at earthlink.net (Stephen Clark) Date: Fri Nov 14 10:37:10 2008 Subject: FreeBSD 6.3 gre and traceroute In-Reply-To: <491DC28E.80804@elischer.org> References: <491B2703.4080707@earthlink.net> <491B31F7.30200@elischer.org> <491B4345.80106@earthlink.net> <491B47D2.6010804@elischer.org> <491C2235.4090509@earthlink.net> <1226589468.1976.12.camel@wombat.2hip.net> <491C4EC2.2000802@earthlink.net> <491D6CED.50006@earthlink.net> <491DC28E.80804@elischer.org> Message-ID: <491DC54A.1090907@earthlink.net> Julian Elischer wrote: > Stephen Clark wrote: >> Stephen Clark wrote: > >>>>>> >>>>>> 10.0.129.1 FreeBSD workstation >>>>>> ^ >>>>>> | >>>>>> | ethernet >>>>>> | >>>>>> v >>>>>> 10.0.128.1 Freebsd FW "A" >>>>>> ^ >>>>>> | >>>>>> | gre / ipsec >>>>>> | >>>>>> v >>>>>> 192.168.3.1 FreeBSD FW "B" >>>>>> ^ >>>>>> | >>>>>> | ethernet >>>>>> | >>>>>> v >>>>>> 192.168.3.86 linux workstation >>>>>> > >>> Also just using gre's without the underlying ipsec tunnels seems to >>> work properly. > > > This is the crux of the matter. > IPSEC happens INSIDE the IP stack. The IP stack is responsible for > the ICMP generation so it is much more likely that there is an > interaction there. > > Now is there an IPSEC rule to make sure that the ICMP packet can get > back? 
It could b ehtat in teh IP stack there is some confusion as to > whether the return packet should be encrypted or not and it might get > dropped. > > the code involved is in /sys/netinet and /sys/netipsec but you'll > probably regret looking in there ;-) > > > >>> >>> >> Another data point I had been using option FILTER_GIF I tried a kernel >> without that option and it behaved the same. >> >> Steve >> > I agree I put a diag in ip_input.c if (ip->ip_ttl <= IPTTLDEC) { icmp_error(m, ICMP_TIMXCEED, ICMP_TIMXCEED_INTRANS, 0, 0); return; and sure enough it is calling icmp_error, but I think it can't figure out how to route the packet back. I been looking at my SPD to see if I can make some adjustment to the policy that would help. -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) From julian at elischer.org Fri Nov 14 10:37:36 2008 From: julian at elischer.org (Julian Elischer) Date: Fri Nov 14 10:37:44 2008 Subject: FreeBSD 6.3 gre and traceroute In-Reply-To: <491D6CED.50006@earthlink.net> References: <491B2703.4080707@earthlink.net> <491B31F7.30200@elischer.org> <491B4345.80106@earthlink.net> <491B47D2.6010804@elischer.org> <491C2235.4090509@earthlink.net> <1226589468.1976.12.camel@wombat.2hip.net> <491C4EC2.2000802@earthlink.net> <491D6CED.50006@earthlink.net> Message-ID: <491DC28E.80804@elischer.org> Stephen Clark wrote: > Stephen Clark wrote: >>>>> >>>>> 10.0.129.1 FreeBSD workstation >>>>> ^ >>>>> | >>>>> | ethernet >>>>> | >>>>> v >>>>> 10.0.128.1 Freebsd FW "A" >>>>> ^ >>>>> | >>>>> | gre / ipsec >>>>> | >>>>> v >>>>> 192.168.3.1 FreeBSD FW "B" >>>>> ^ >>>>> | >>>>> | ethernet >>>>> | >>>>> v >>>>> 192.168.3.86 linux workstation >>>>> >> Also just using gre's without the >> underlying ipsec tunnels seems to >> work properly. This is the crux of the matter. IPSEC happens INSIDE the IP stack. 
The IP stack is responsible for the ICMP generation so it is much more likely that there is an interaction there.

Now, is there an IPSEC rule to make sure that the ICMP packet can get back? It could be that in the IP stack there is some confusion as to whether the return packet should be encrypted or not, and it might get dropped.

The code involved is in /sys/netinet and /sys/netipsec, but you'll probably regret looking in there ;-)

>>
>>
> Another data point I had been using option FILTER_GIF I tried a kernel
> without that option and it behaved the same.
>
> Steve
>

From rnoland at FreeBSD.org Fri Nov 14 10:57:28 2008
From: rnoland at FreeBSD.org (Robert Noland)
Date: Fri Nov 14 10:57:35 2008
Subject: FreeBSD 6.3 gre and traceroute
In-Reply-To: <491DC28E.80804@elischer.org>
References: <491B2703.4080707@earthlink.net> <491B31F7.30200@elischer.org> <491B4345.80106@earthlink.net> <491B47D2.6010804@elischer.org> <491C2235.4090509@earthlink.net> <1226589468.1976.12.camel@wombat.2hip.net> <491C4EC2.2000802@earthlink.net> <491D6CED.50006@earthlink.net> <491DC28E.80804@elischer.org>
Message-ID: <1226688153.1719.23.camel@squirrel.corp.cox.com>

On Fri, 2008-11-14 at 10:25 -0800, Julian Elischer wrote:
> Stephen Clark wrote:
> > Stephen Clark wrote:
>
> >>>>>
> >>>>> 10.0.129.1 FreeBSD workstation
> >>>>> ^
> >>>>> |
> >>>>> | ethernet
> >>>>> |
> >>>>> v
> >>>>> 10.0.128.1 Freebsd FW "A"
> >>>>> ^
> >>>>> |
> >>>>> | gre / ipsec
> >>>>> |
> >>>>> v
> >>>>> 192.168.3.1 FreeBSD FW "B"
> >>>>> ^
> >>>>> |
> >>>>> | ethernet
> >>>>> |
> >>>>> v
> >>>>> 192.168.3.86 linux workstation
> >>>>>
>
> >> Also just using gre's without the
> >> underlying ipsec tunnels seems to
> >> work properly.
>
>
> This is the crux of the matter.
> IPSEC happens INSIDE the IP stack. The IP stack is responsible for
> the ICMP generation so it is much more likely that there is an
> interaction there.
>
> Now is there an IPSEC rule to make sure that the ICMP packet can get
> back?
It could b ehtat in teh IP stack there is some confusion as to > whether the return packet should be encrypted or not and it might get > dropped. > > the code involved is in /sys/netinet and /sys/netipsec but you'll > probably regret looking in there ;-) Right, I don't really know the IPSEC code, but I was told by someone who is familiar with it that this is a known problem and that the use of GRE is not relevant. Hopefully he will have a moment to respond to this thread with a bit more detail. robert. > > > >> > >> > > Another data point I had been using option FILTER_GIF I tried a kernel > > without that option and it behaved the same. > > > > Steve > > > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20081114/b5b5f0f7/attachment.pgp From rizzo at iet.unipi.it Fri Nov 14 11:41:13 2008 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Fri Nov 14 11:41:19 2008 Subject: diskless+pxe notes Message-ID: <20081114194534.GA64986@onelab2.iet.unipi.it> Hi, i finally decided to try and use pxeboot to replace the etherboot method I was using so far for diskless setups. The goal is to fully share the server's root and /usr directories, as documented in diskless(8). I'd like to share the following notes, hopefully to go in the manpage. cheers luigi -------------- root path configuration ----------------- There seems to be a well known problem in pxeloader, see kern/106493 , where pxeloader defaults to using a root path of /pxeroot when offered "/" . The patch suggested in http://www.freebsd.org/cgi/query-pr.cgi?pr=106493 is trivial and judging from it I believe this is addressing a true bug and not a feature. Fortunately there is a workaround (suggested in the PR) which is using "//" as a root path. 
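
A small C sketch of the root path behaviour described above may make the "//" workaround easier to see. It is only a model: the function name `sketch_effective_rootpath` is hypothetical, and the guess that the loader looks at the second character of the offered path is an assumption made for illustration, not code taken from pxeloader or from the PR. Only the `/pxeroot` default and the `//` workaround come from the notes.

```c
#include <stddef.h>

/* Default root path named in the notes above. */
#define SKETCH_DEFAULT_ROOTPATH "/pxeroot"

/*
 * Hypothetical model of how the root path might be chosen.
 *
 * ASSUMPTION: the path offered by the DHCP server is treated as "unset"
 * when it has fewer than two characters, so both "" and "/" fall back to
 * the default, while "//" (two characters) is kept unchanged.
 */
const char *
sketch_effective_rootpath(const char *offered)
{
	if (offered == NULL || offered[0] == '\0' || offered[1] == '\0')
		return SKETCH_DEFAULT_ROOTPATH;
	return offered;
}
```

Under this assumption, offering "/" yields "/pxeroot", while offering "//" is passed through and still resolves to the server's root directory.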
------------- sharing /boot with the server ---------------

I believe it is quite useful to share the whole root partition between the server and the diskless client. This would require at a minimum some conditional code in loader.conf (or loader.rc, etc) so that at least you point to different kernels.

A minimalistic approach can be adding this line to /boot/loader.conf

    bootfile="kernel\\${loaddev};kernel"

The variable $loaddev contains the name of the load device, which is "pxe0" in the case of pxeboot, and disk* in other cases when loading from the local disk. If you make sure that there is no 'kernel.disk*' in the directory, and instead there is a kernel.pxe0 in the same directory, then the diskless machines and the server will boot from the proper file.

Unfortunately I don't know how to implement a conditional in /boot/loader.conf -- otherwise one could do much nicer things such as differentiate which modules to load and so on.

--------------- pxeloader bug in 7.x ---------------------------

Also worth mentioning is an annoying bug in pxeloader as compiled on 7.x, see http://www.freebsd.org/cgi/query-pr.cgi?pr=118222 i.e. the pxeloader in 7.x fails to proceed and prints a message "can't figure out which disk we are booting from".

The workaround is to use a pxeloader from FreeBSD 6, which works. I guess this is a compiler-related problem (given that 6.x uses gcc 3.4 as a compiler, while 7.x uses gcc 4.2).

-----------------------------------------------------------------

From bzeeb-lists at lists.zabbadoz.net Sat Nov 15 02:40:18 2008
From: bzeeb-lists at lists.zabbadoz.net (Bjoern A.
Zeeb)
Date: Sat Nov 15 02:40:31 2008
Subject: FreeBSD 6.3 gre and traceroute
In-Reply-To: <1226688153.1719.23.camel@squirrel.corp.cox.com>
References: <491B2703.4080707@earthlink.net> <491B31F7.30200@elischer.org> <491B4345.80106@earthlink.net> <491B47D2.6010804@elischer.org> <491C2235.4090509@earthlink.net> <1226589468.1976.12.camel@wombat.2hip.net> <491C4EC2.2000802@earthlink.net> <491D6CED.50006@earthlink.net> <491DC28E.80804@elischer.org> <1226688153.1719.23.camel@squirrel.corp.cox.com>
Message-ID: <20081115102746.K61259@maildrop.int.zabbadoz.net>

On Fri, 14 Nov 2008, Robert Noland wrote:

Hi,

>>>> Also just using gre's without the
>>>> underlying ipsec tunnels seems to
>>>> work properly.

The reason for this to my knowledge is:
http://www.kame.net/dev/cvsweb2.cgi/kame/freebsd2/sys/netinet/ip_icmp.c#rev1.4
or looking at recent freebsd code:
http://fxr.watson.org/fxr/source/netinet/ip_icmp.c#L164

Look for M_DECRYPTED.

Now what happens in your case: you receive an IPSec ESP packet, which gets decrypted; that sets M_DECRYPTED on the mbuf. The packet passes through various parts, gets up to gre, gets decapsulated, is an IP packet (again), gets to ip_input, the TTL has expired, icmp_error is called, and it is still the same mbuf that originally got M_DECRYPTED set. Thus the packet is just freed and you never see anything.

So, thinking about it, this has nothing to do with gre (or gif, for example) in the first place. Arguably, when the packet is passed on to another decapsulation, the flag should be cleared, for example when entering gre().

The other question, of course, is why we do not send the ICMP error back even on plain IPsec. Is it because we could possibly leak information, as sending it back is not caught by the policy?

/bz

--
Bjoern A. Zeeb Stop bit received. Insert coin for new game.
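
The sequence described in the message above can be modelled with a few lines of C. This is a user-space sketch, not kernel code: the `sketch_` names, the one-field structure standing in for an mbuf, and the numeric flag value are all assumptions made for illustration. Only the idea comes from the message: ESP decryption sets M_DECRYPTED, the same buffer travels through gre, and icmp_error stays silent when the flag is still set.

```c
#include <stdbool.h>

/*
 * Illustrative flag bit. ASSUMPTION: the value is made up for this
 * sketch; the real M_DECRYPTED is defined in the kernel headers.
 */
#define SKETCH_M_DECRYPTED 0x1

/* Minimal stand-in for an mbuf: only the flags matter here. */
struct sketch_mbuf {
	int flags;
};

/* IPsec input: decrypting an ESP packet marks the buffer. */
void
sketch_esp_input(struct sketch_mbuf *m)
{
	m->flags |= SKETCH_M_DECRYPTED;
}

/*
 * gre decapsulation reuses the same buffer, so the flag survives.
 * Passing clear_flag = true models the change suggested in the message
 * (clearing the flag when the packet enters another decapsulation).
 */
void
sketch_gre_input(struct sketch_mbuf *m, bool clear_flag)
{
	if (clear_flag)
		m->flags &= ~SKETCH_M_DECRYPTED;
}

/*
 * Returns true if an ICMP error would be sent for this buffer, false if
 * the buffer would be freed without a reply.
 */
bool
sketch_icmp_error(const struct sketch_mbuf *m)
{
	return (m->flags & SKETCH_M_DECRYPTED) == 0;
}
```

In this model a buffer that went through `sketch_esp_input` and then `sketch_gre_input(m, false)` gets no ICMP reply, which matches the "* * *" seen in the traceroute output, while a buffer that never passed through ESP, or whose flag was cleared on entry to gre, does get one.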
From danny at cs.huji.ac.il Sat Nov 15 03:03:35 2008
From: danny at cs.huji.ac.il (Danny Braniss)
Date: Sat Nov 15 03:03:42 2008
Subject: diskless+pxe notes
In-Reply-To: <20081114194534.GA64986@onelab2.iet.unipi.it>
References: <20081114194534.GA64986@onelab2.iet.unipi.it>
Message-ID:

> Hi,
> i finally decided to try and use pxeboot to replace the etherboot
> method I was using so far for diskless setups.
>
> The goal is to fully share the server's root and /usr directories,
> as documented in diskless(8). I'd like to share the following
> notes, hopefully to go in the manpage.
>
> cheers
> luigi
>

Hi,

With a slightly modified libstand/bootp.c - a PR was sent way back, but you can check ftp://ftp.cs.huji.ac.il/users/danny/freebsd/diskless-boot - you can control the diskless boot options. By commenting out kernel= in /boot/defaults/loader.conf you can set it in dhcpd.conf. Since most of the tags received via dhcp are placed in kenv, the crucial options are there!

BTW, we use diskless servers/workstations for 90% of our hosts, the exceptions being:
 - the dhcp/tftp server
 - a 'lagged' server - the router/server get confused :-)
 - our mail servers, there is a bug somewhere, where some critical network resources get deadlocked.
 - our development servers.

The / of the diskless is almost identical to the server, but for many reasons, I like to keep it apart.

The trick to overcome the read-only problem is using unionfs for /etc. In rc.initdiskless:

    if [ -e /conf/union ]; then
        kldload unionfs
        mount_md 4096 /.etc
        mount_unionfs -o transparent /.etc /etc
    fi

The /conf is nfs mounted from a central site, the location is passed via dhcp:

    confpath=`kenv conf-path`
    if [ -n "$confpath" ] ; then
        # only mount when the value contains a colon (host:/path)
        if [ "`expr $confpath : '\(.*\):'`" ] ; then
            echo Mounting $confpath on /conf
            mount_nfs $confpath /conf
            chkerr $? "mount_nfs $confpath /conf"
            to_umount="${to_umount} $confpath"
        fi
    fi

The actual rc.conf is configured like this:

    eval `kenv | sed -n 's/^rc\.//p'`
    rm -f /etc/rc.conf /etc/rc.conf.local
    for fc in $conf0 $conf1 $conf2 $conf3 $conf4 $conf5 $conf6 $conf7 $conf8 $conf9 rc.conf.$hostname
    do
        ho=`expr $fc : '\(.*\):'`       # part before the colon, empty if none
        fl=`expr $fc : '.*/\(.*\)'`     # file name after the last slash
        if [ "${ho}" != "" ]; then
            mp=`expr $fc : '\(.*\)/.*'` # everything before the last slash
            mount_nfs $mp /mnt > /dev/null 2>&1
            if [ -f /mnt/$fl ]; then
                echo "# from $fc /mnt/$fl" >> /etc/rc.conf
                cat /mnt/$fl >> /etc/rc.conf
            fi
            umount /mnt > /dev/null 2>&1
        elif [ -e /conf/$fc ] ; then
            echo "# from /conf/$fc" >> /etc/rc.conf
            cat /conf/$fc >> /etc/rc.conf
        fi
    done

> -------------- root path configuration -----------------
>
> There seems to be a well known problem in pxeloader, see
> kern/106493 , where pxeloader defaults to using a root path of
> /pxeroot when offered "/" . The patch suggested in
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=106493
>
> is trivial and judging from it I believe this is addressing a
> true bug and not a feature. Fortunately there is a workaround
> (suggested in the PR) which is using "//" as a root path.
>
> ------------- sharing /boot with the server ---------------
>
> I believe it is quite useful to share the whole root
> partition between the server and the diskless client.
> This would require at a minimum some conditional code
> in loader.conf (or loader.rc, etc) so that at least you
> point to different kernels.
>
> A minimalistic approach can be adding this line to /boot/loader.conf
>
> bootfile="kernel\\${loaddev};kernel"
>
> The variable $loaddev contains the name of the load device,
> which is "pxe0" in the case of pxeboot, and disk* in other
> cases when loading from the local disk.
> If you make sure that there is no 'kernel.disk*' on the
> directory, and instead there is a kernel.pxe0 in the same
> directory, then the diskless machines and the server will boot
> from the proper file.
> > Unfortunately i don't know how to implement a conditional > in /boot/loader.conf -- otherwise one could do much nicer things > such as differentiate which modules to load and so on. > > --------------- pxeloader bug in 7.x --------------------------- > Also worth mentioning is an annoying bug in pxeloader as compiled > on 7.x, see http://www.freebsd.org/cgi/query-pr.cgi?pr=118222 > i.e. the pxeloader in 7.x fails to proceed and prints a message > "can't figure out which disk we are booting from". > > The workaround is using a pxeloader from FreeBSD6 works. > I guess this is a compiler-related problem (given that 6.x uses gcc 3.4 > as a compiler, while 7.x uses gcc 4.2). > > ----------------------------------------------------------------- > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > From freebsd at the-irc.org Sat Nov 15 03:40:22 2008 From: freebsd at the-irc.org (The-IRC FreeBSD) Date: Sat Nov 15 03:40:30 2008 Subject: TCP Stack Issues Under FreeBSD 7.1 Message-ID: <322efb7b0811150326t7de120dfw874c4a5eec3b9c2d@mail.gmail.com> Hi, Anyone else noticing any TCP Stack requests for information under a useraccount with mild to moderate TCP activity on HTTP and other sorts of ports returns zero results back unless you are root. 
[site@Eden ~]$ netstat -i reports netstat: kvm not available: /dev/mem: Permission denied ifnet: symbol not defined [site@Eden ~]$ netstat -an [site@Eden ~]$ netstat -m 377/823/1200 mbufs in use (current/cache/total) 64/378/442/32768 mbuf clusters in use (current/cache/total/max) 64/315 mbuf+clusters out of packet secondary zone in use (current/cache) 2/386/388/12800 4k (page size) jumbo clusters in use (current/cache/total/max) 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) 230K/2505K/2736K bytes allocated to network (current/cache/total) 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0/525/6656 sfbufs in use (current/peak/max) 0 requests for sfbufs denied 0 requests for sfbufs delayed 57093 requests for I/O initiated by sendfile 0 calls to protocol drain routines FreeBSD Eden.The-IRC.Com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #4: Sun Nov 2 22:20:01 CST 2008 root@Eden.The-IRC.Com:/usr/obj/usr/src/sys/THE-IRC i386 From freebsd at the-irc.org Sat Nov 15 04:27:58 2008 From: freebsd at the-irc.org (The-IRC FreeBSD) Date: Sat Nov 15 04:28:05 2008 Subject: TCP Stack Issues Under FreeBSD 7.1 In-Reply-To: <322efb7b0811150326t7de120dfw874c4a5eec3b9c2d@mail.gmail.com> References: <322efb7b0811150326t7de120dfw874c4a5eec3b9c2d@mail.gmail.com> Message-ID: <322efb7b0811150427q60a128cbg66a9e312e7bd4b0f@mail.gmail.com> never mind, the TCP Stack is working perfectly under 7.1 unlike 7.0 with ghost entries from different users. It didn't show because root was behiend http. My mistake folks, keep up the good work ;) On Sat, Nov 15, 2008 at 5:26 AM, The-IRC FreeBSD wrote: > Hi, > > > Anyone else noticing any TCP Stack requests for information under a > useraccount with mild to moderate TCP activity on HTTP and other sorts of > ports returns zero results back unless you are root. 
> > [site@Eden ~]$ netstat -i reports > netstat: kvm not available: /dev/mem: Permission denied > ifnet: symbol not defined > [site@Eden ~]$ netstat -an > [site@Eden ~]$ netstat -m > 377/823/1200 mbufs in use (current/cache/total) > 64/378/442/32768 mbuf clusters in use (current/cache/total/max) > 64/315 mbuf+clusters out of packet secondary zone in use (current/cache) > 2/386/388/12800 4k (page size) jumbo clusters in use > (current/cache/total/max) > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > 230K/2505K/2736K bytes allocated to network (current/cache/total) > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 0/525/6656 sfbufs in use (current/peak/max) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 57093 requests for I/O initiated by sendfile > 0 calls to protocol drain routines > > > > FreeBSD Eden.The-IRC.Com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #4: Sun > Nov 2 22:20:01 CST 2008 root@Eden.The-IRC.Com:/usr/obj/usr/src/sys/THE-IRC > i386 > From jwb at homer.att.com Sat Nov 15 10:22:11 2008 From: jwb at homer.att.com (J. W. Ballantine) Date: Sat Nov 15 10:22:18 2008 Subject: _nyssin undefined Message-ID: <200811151755.MAA26628@hera.homer.att.com> Hi Yesterday AM, 11/14, I cvs'ed the 7-stable sources and did a system build/install. Now all I get is: /lib/ld-elf.so.1: /lib/libc.so.7: Undefined symbol _nyssin. Other than a reinstall, is there any way to recover from this? Thanks Jim Ballantine From bh at izb.knu.ac.kr Sat Nov 15 12:18:56 2008 From: bh at izb.knu.ac.kr (Byung-Hee HWANG) Date: Sat Nov 15 12:19:02 2008 Subject: [OT] Waiting for 7.1 Message-ID: <491F2EAA.9040405@izb.knu.ac.kr> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 When does the official version of 7.1 (7.1-RELEASE) release out? Does it have some critical issue? Or how it goes? 
Just i'm waiting for 7.1 because of i have some plan with 7.1 personally. Cheer up, Ken and the Release Engineering Team! byunghee -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (FreeBSD) iEYEARECAAYFAkkfLqoACgkQsCouaZaxlv7k4ACeMA8yzbi2dXxAFt5Xcp00wIHd +gkAnAuZv1I019+Ipi0F4VSMH9vjXsvM =Fm2L -----END PGP SIGNATURE----- From jhs at berklix.org Sat Nov 15 14:13:52 2008 From: jhs at berklix.org (Julian Stacey) Date: Sat Nov 15 14:14:00 2008 Subject: [OT] Waiting for 7.1 In-Reply-To: Your message "Sun, 16 Nov 2008 05:18:50 +0900." <491F2EAA.9040405@izb.knu.ac.kr> Message-ID: <200811152213.mAFMDPo8041209@fire.js.berklix.net> Hi, > When does the official version of 7.1 (7.1-RELEASE) release out? Does it > have some critical issue? Or how it goes? Just i'm waiting for 7.1 > because of i have some plan with 7.1 personally. Cheer up, Ken and the > Release Engineering Team! Previously there's been a big ToDO list, I took a quick look to see if I could quote a URL, but only see: http://www.freebsd.org/releases/7.1R/schedule.html lists RC2 builds 29 September 2008 http://www.freebsd.org/releng/index.html November 2008 No matter, re@ will release when ready :-) I'm running 7.1-BETA2 on AMD64, with a load of home built current ports too. No problems with src/, just usual ports/ things. Julian -- Julian Stacey: BSDUnixLinux C Prog Admin SysEng Consult Munich www.berklix.com Mail plain ASCII text. HTML & Base64 text are spam. www.asciiribbon.org From peterjeremy at optushome.com.au Sat Nov 15 14:46:10 2008 From: peterjeremy at optushome.com.au (Peter Jeremy) Date: Sat Nov 15 14:46:17 2008 Subject: _nyssin undefined In-Reply-To: <200811151755.MAA26628@hera.homer.att.com> References: <200811151755.MAA26628@hera.homer.att.com> Message-ID: <20081115224604.GG51761@server.vk2pj.dyndns.org> On 2008-Nov-15 12:55:12 -0500, "J. W. Ballantine" wrote: >Yesterday AM, 11/14, I cvs'ed the 7-stable sources and did a >system build/install. 
Now all I get is: > /lib/ld-elf.so.1: /lib/libc.so.7: Undefined symbol _nyssin. I don't recognize and can't find that symbol in my (older) 7-STABLE sources so I'm not sure what might have triggered it. However, I don't have ld-elf.so.1 in /lib - it's in /libexec. Is that a typo on your part? At what point do you get that error? Can you get to a single user mode shell? The previous ld-elf.so.1 is saved as /libexec/ld-elf.so.1.old - you can use tools in /rescue to rename it (you will need to use chflags to clear the schg flag on /libexec/ld-elf.so.1 before you can overwrite it). Unfortunately, there's no backup of /lib/libc.so.7 so if that's the problem, you will need to recover it from backups or a live filesystem CD. -- Peter Jeremy Please excuse any delays as the result of my ISP's inability to implement an MTA that is either RFC2821-compliant or matches their claimed behaviour. From rorya+freebsd.org at TrueStep.com Sun Nov 16 00:22:49 2008 From: rorya+freebsd.org at TrueStep.com (Rory Arms) Date: Sun Nov 16 00:22:55 2008 Subject: 6.4-RC2 crashes after a few minutes of uptime In-Reply-To: <1226078239.37011.37.camel@bauer.cse.buffalo.edu> References: <9592E887-75F3-473F-9581-F9C22A9936A6@TrueStep.com> <1226078239.37011.37.camel@bauer.cse.buffalo.edu> Message-ID: <9C64A87F-1359-4694-8238-6C4D4B025BE3@TrueStep.com> On 2008-11-07, at 12:17 , Ken Smith wrote: > On Fri, 2008-11-07 at 00:00 -0500, Rory Arms wrote: >> Well, if I can assist with further debugging, let me know. > > The person who followed up with a list of things that *may* have made > the problem go away mentioned one of the things was disabling powerd.
> Do you have that enabled, and if yes would you mind disabling it to see > if that's the culprit? Ken, OK, I guess something is amiss with the CD-ROM drive on this notebook: in GNOME, it flashes an icon of a CD on the desktop from time to time, as if it has detected a disc in the drive. But of course there is no disc in the drive. I believe it did the same with 6.3 though, but as said before it never panicked due to this issue. So, some anecdotal info, after running RC2 for a few days now. The pattern seems to be that it always panics a few minutes after a first cold boot, but then remains stable after the second boot. Odd, as with 6.3 this didn't happen. So, I happened to catch a panic while working in the syscons console after one of these cold boots. As far as I can tell, the panic does have something to do with the CD-ROM drive, as right after I saw this message on the console, it immediately panicked: acd0: WARNING - PREVENT_ALLOW read data overrun 18>0 and then the panic is as follows: kernel trap 12 with interrupts disabled Fatal trap 12: page fault while in kernel mode fault virtual address = 0x78 fault code = supervisor read, page not present instruction pointer = 0x20:0xc06d39b9 stack pointer = 0x28:0xca865c10 frame pointer = 0x28:0xca865c14 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = resume, IOPL = 0 current process = 19 (swi6: task queue) trap number = 12 panic: page fault Uptime: 1h9m7s Physical memory: 179MB Dumping 43MB: 28 12 Dump complete This is also the computer, as you may recall, on which I can never get kgdb(1) to open the core dump file. Note the uptime on that particular boot was 1h because I pretty much let it sit idle after booting. So, gdm loaded and then I switched to syscons, logged in, and then pretty much let it idle, till it panicked.
Hope that helps, - rory From bzeeb-lists at lists.zabbadoz.net Sun Nov 16 02:15:10 2008 From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. Zeeb) Date: Sun Nov 16 02:15:23 2008 Subject: hangs for 7.1-PRE [was: problem possibly related to multi-ip jail patch?] In-Reply-To: <2192B50F-16AE-4BC8-ACEC-6C5B99804DA0@yellowspace.net> References: <2192B50F-16AE-4BC8-ACEC-6C5B99804DA0@yellowspace.net> Message-ID: <20081116100529.Y61259@maildrop.int.zabbadoz.net> On Sun, 16 Nov 2008, Lorenzo Perone wrote: Hi, > I've been experiencing problems with one of the machines running FreeBSD > 7.1-PRERELEASE #2: Thu Oct 16 20:23:09 CEST 2008 with the multi-ip patch > bz_jail7-20080920-01-at150161.diff, and I'm wondering if it possibly related > to the patch - in any case, any advice would be very welcome. bottom line is that most of this looks less likely to be a jail problem. > It happens that mysql (tried both 4.0 and 5.1, in 2 separate jails), at some > time stop responding to connections, and mysql gets stuck in sbwait state. It > is only killable with kill -9 Yeah, I had been seeing mysql hang or go to 99% CPU for years once in a while; it's been more rare the last months. I have seen it in- and outside of jails, with or without patches. You could try to see if you can get backtraces of those processes. > each of the two mysqlds is running in a jail on one private IP, serving > connections to a webserver nearby - the latter having one public and one > private IP, communicating with the other jail via the private network. > > I also experienced two complete system hangs (which must not be necessarily > related to the mysql problem) both during a shutdown -r now. one was a panic, > in another case the machine was still pingable but did not shut down > completely. I could only reset it over the DRAC. 
here's a screenshot I made > over the Dell RAC: http://lorenzo.yellowspace.net/stuck.png Looking at your image I see more problems before the shutdown so this as well is most likely not a jail problem. > Since I'm also using zfs there and the kernel has been built with the DTRACE > options. > > any advice (also about which more details that I should/could provide) would > be very welcome... I am Cc:ing the answer to stable@ and setting reply-to: to move the discussion there. /bz -- Bjoern A. Zeeb Stop bit received. Insert coin for new game. From barbara.xxx1975 at libero.it Sun Nov 16 04:25:04 2008 From: barbara.xxx1975 at libero.it (Barbara) Date: Sun Nov 16 04:25:12 2008 Subject: 6.4-RC2 crashes after a few minutes of uptime Message-ID: <13170840.199821226838282753.JavaMail.defaultUser@defaultHost> >Ok, guess something is amiss with the CD-ROM drive on this notebook, >as in GNOME, it flashes an icon of a CD on the desktop from time to >time, as if it has detected a disc in the drive. But of course there >is no disc in the drive. I believe it did the same with 6.3 though, >but as said before didn't ever panic due to this issue. > >So, some anecdotal info, after running RC2 for a few days now. It >seems the pattern is that it seems to always panic a few minutes after >a first cold boot, but then seems to remain stable after the second >boot. Odd, as with 6.3 this didn't happen. So, I happened to catch a >panic while working in the syscons console after one of these cold >boots. 
As far as I can tell, the panic does have something to do with >the CD-ROM drive, as right after I saw this message on the >console, it immediately panicked: > >acd0: WARNING - PREVENT_ALLOW read data overrun 18>0 > >and then the panic is as follows: > >kernel trap 12 with interrupts disabled > >Fatal trap 12: page fault while in kernel mode >fault virtual address = 0x78 >fault code = supervisor read, page not present >instruction pointer = 0x20:0xc06d39b9 >stack pointer = 0x28:0xca865c10 >frame pointer = 0x28:0xca865c14 >code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, def32 1, gran 1 >processor eflags = resume, IOPL = 0 >current process = 19 (swi6: task queue) >trap number = 12 >panic: page fault >Uptime: 1h9m7s >Physical memory: 179MB >Dumping 43MB: 28 12 >Dump complete Hi Rory, did you see my replies or are you missing them for any reason? Your panics, and some aspects of how they happen, look like mine to me; look here: http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html Unfortunately I got no answer about that and I've had no comment on the PR I've filed http://www.freebsd.org/cgi/query-pr.cgi?pr=128076 I wonder if someone has had the time to look at it. From rorya+freebsd.org at TrueStep.com Sun Nov 16 13:21:49 2008 From: rorya+freebsd.org at TrueStep.com (Rory Arms) Date: Sun Nov 16 13:21:56 2008 Subject: 6.4-RC2 crashes after a few minutes of uptime In-Reply-To: <13170840.199821226838282753.JavaMail.defaultUser@defaultHost> References: <13170840.199821226838282753.JavaMail.defaultUser@defaultHost> Message-ID: <369CC50A-9CF4-4F9B-8D22-153294B93532@TrueStep.com> On 2008-11-16, at 7:24 , Barbara wrote: > > > >> Ok, guess something is amiss with the CD-ROM drive on this notebook, >> as > in GNOME, it flashes an icon of a CD on the desktop from time to >> time, as if > it has detected a disc in the drive. But of course there >> is no disc in the > drive.
I believe it did the same with 6.3 though, >> but as said before didn't > ever panic due to this issue. >> >> So, some anecdotal info, after running RC2 for > a few days now. It >> seems the pattern is that it seems to always panic a few > minutes after >> a first cold boot, but then seems to remain stable after the > second >> boot. Odd, as with 6.3 this didn't happen. So, I happened to catch > a >> panic while working in the syscons console after one of these cold > >> boots. As far as I can tell, the panic does have something to do >> with >> the > the CD-ROM drive, as right after I saw this message on the >> console, it > immediately paniced: >> >> acd0: WARNING - PREVENT_ALLOW read data overrun 18>0 >> > >> and then the panic is as follows: >> >> kernel trap 12 with interrupts disabled >> > >> Fatal trap 12: page fault while in kernel mode >> fault virtual address = 0x78 > >> fault code = supervisor read, page not present >> instruction pointer = > 0x20:0xc06d39b9 >> stack pointer = 0x28:0xca865c10 >> frame pointer = 0x28: > 0xca865c14 >> code segment = base 0x0, limit 0xfffff, type 0x1b >> = DPL 0, > pres 1, def32 1, gran 1 >> processor eflags = resume, IOPL = 0 >> current process > = 19 (swi6: task queue) >> trap number = 12 >> panic: page fault >> Uptime: > 1h9m7s >> Physical memory: 179MB >> Dumping 43MB: 28 12 >> Dump complete > > Hi Rory, > > did you see my replies or are you missing them for any reason? Yes, I have seen your replies. I must have missed the PR you mentioned last time, sorry. > > > Your panics and > some aspects about how they happens look like mine to me, look here: > http: > //lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html Yes, indeed. That looks very similar to the issue I'm running into with 6.4-RC2 as well. Sounds like it might be a regression in ata(4). At least you were able to open the core dump. Are you still able to open core dumps with RC2? 
> > > > Unfortunately I got no answer about that and I've had no comment in > the pr I've > filed http://www.freebsd.org/cgi/query-pr.cgi?pr=128076 > I wonder if someone had > the time to look at it. > From barbara.xxx1975 at libero.it Sun Nov 16 14:29:03 2008 From: barbara.xxx1975 at libero.it (barbara) Date: Sun Nov 16 14:29:11 2008 Subject: 6.4-RC2 crashes after a few minutes of uptime Message-ID: > > Hi Rory, > > > > did you see my replies or are you missing them for any reason? > > Yes, I have seen your replies. I must have missed the PR you mentioned > last time, sorry. No problem! > > Your panics and > > some aspects about how they happen look like mine to me, look here: > > http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html > > Yes, indeed. That looks very similar to the issue I'm running into > with 6.4-RC2 as well. Sounds like it might be a regression in ata(4). > At least you were able to open the core dump. Are you still able to > open core dumps with RC2? > I'm not sure. I'm running STABLE and I have had no panics since the branch changed to RC2. It seems that my panics are not as frequent as yours. Anyway, my box froze a couple of times after the last newvers.sh and the symptoms looked the same, with messages about acd0. I was able to ping it but it wouldn't let me ssh in, as if it was using all the CPUs. About kgdb... I have never used freebsd-update, so sorry if I'm saying something stupid, but could it be the case that the kernel has been built without debugging symbols or something like that? Does freebsd-update provide a kernel.debug? I've seen that you are not using a shiny quad-core, but could you try building a kernel by yourself? I think that you could do it using a different, more powerful FreeBSD box if you have one, or even on qemu. I could help if you wish.
From p.christias at noc.ntua.gr Sun Nov 16 16:01:51 2008 From: p.christias at noc.ntua.gr (Panagiotis Christias) Date: Sun Nov 16 16:01:58 2008 Subject: qlogic qle2462 hba and freebsd stable on a dl360 g5 In-Reply-To: References: Message-ID: <20081117000147.GA52109@noc.ntua.gr> On Thu, Nov 13, 2008 at 12:22:11PM +0100, Claus Guttesen wrote: > Hi. > > I'm looking at a qlogic qle2462 hba for my dl360 g5. The thread > http://www.mail-archive.com/freebsd-stable@freebsd.org/msg99497.html > mentions a deadlock when system is loaded. Has this issue been > resolved? Are there other PCI Express hba's which are known to work > with freebsd stable and dl360 g5? Hello, no, the issue has not been resolved. The system still deadlocks regardless the value of tag openings (even when set to the minimum value of 2) and the filesystem gets corrupted or totally destroyed. I am still looking for a solution and willing to do any tests. Regards, Panagiotis -- Panagiotis J. Christias Network Management Center P.Christias@noc.ntua.gr National Technical Univ. of Athens, GREECE From rorya+freebsd.org at TrueStep.com Sun Nov 16 16:02:29 2008 From: rorya+freebsd.org at TrueStep.com (Rory Arms) Date: Sun Nov 16 16:02:36 2008 Subject: 6.4-RC2 crashes after a few minutes of uptime In-Reply-To: References: Message-ID: On 2008-11-16, at 17:28 , barbara wrote: >>> Hi Rory, >>> >>> did you see my replies or are you missing them for any reason? >> >> Yes, I have seen your replies. I must have missed the PR you >> mentioned >> last time, sorry. > > No problem! > >>> Your panics and >>> some aspects about how they happens look like mine to me, look here: >>> http: >>> //lists.freebsd.org/pipermail/freebsd-stable/2008-October/ >>> 045865.html >> >> Yes, indeed. That looks very similar to the issue I'm running into >> with 6.4-RC2 as well. Sounds like it might be a regression in ata(4). >> At least you were able to open the core dump. Are you still able to >> open core dumps with RC2? >> > > I'm not sure. 
I'm running STABLE and I have had no panics since the > branch changed to RC2. It seems that my panics are not as frequent > as yours. > Anyway, my box froze a couple of times after the last newvers.sh and > the symptoms looked the same, with messages about acd0. I was > able to ping it but it wouldn't let me ssh in, as if it was using all > the CPUs. > > About kgdb... > I have never used freebsd-update, so sorry if I'm saying something > stupid, but could it be the case that the kernel has been built > without debugging symbols or something like that? Does freebsd-update > provide a kernel.debug? I haven't had to use the kernel.debug file in the obj dir in a long time. As far as I know, these days, the GENERIC kernel includes debug symbols. And in cases when there aren't any debug symbols, that shouldn't prevent kgdb from loading, I wouldn't think. - rory From jrhett at netconsonance.com Sun Nov 16 16:45:21 2008 From: jrhett at netconsonance.com (Jo Rhett) Date: Sun Nov 16 16:45:52 2008 Subject: 3Ware 9000 series hangs under load In-Reply-To: <20081112204351.ccc51c2f.lehmann@ans-netz.de> References: <20081029170728.be7cc7ab.lehmann@ans-netz.de> <13394481-8FDC-4934-BB12-FA5BCB2D35CD@nevada.net.nz> <20081112204351.ccc51c2f.lehmann@ans-netz.de> Message-ID: Philip Murray wrote: > Anyway, I stopped running 3dmd (or 3dm2 I think it's called now) to > monitor it, and the crashes went away. It's had hundreds of days > uptime since. We have never used 3dm2, and the 9500 units have been rock solid for us. > I've never been game enough to try newer versions of 3dm, but a > cronjob of tw_cli allows me to monitor it now without the lockups. > Might not be your problem, but it's worth a shot if all else fails. The driver logs all useful stuff, and the SEC logfile surfer does a good job of notifying you quickly. I can send you an SEC configuration for that if you want.
-- Jo Rhett Net Consonance : consonant endings by net philanthropy, open source and other randomness From jrhett at netconsonance.com Sun Nov 16 16:46:23 2008 From: jrhett at netconsonance.com (Jo Rhett) Date: Sun Nov 16 16:46:31 2008 Subject: 3Ware 9000 series hangs under load In-Reply-To: <95E9EA2C-C288-4F11-AD35-FE6AF6633A09@nevada.net.nz> References: <20081029170728.be7cc7ab.lehmann@ans-netz.de> <13394481-8FDC-4934-BB12-FA5BCB2D35CD@nevada.net.nz> <20081112204351.ccc51c2f.lehmann@ans-netz.de> <95E9EA2C-C288-4F11-AD35-FE6AF6633A09@nevada.net.nz> Message-ID: On Nov 12, 2008, at 12:37 PM, Philip Murray wrote: > I just installed sysutils/tw_cli from ports, and it sets up some > 'periodic' scripts for you. To be precise it puts 407.status-3ware- > raid in /usr/local/etc/periodic/daily Don't use that. It's a very old version of the code. Use the binary version of tw_cli that matches the firmware on your controller. -- Jo Rhett Net Consonance : consonant endings by net philanthropy, open source and other randomness From p.christias at noc.ntua.gr Sun Nov 16 17:13:23 2008 From: p.christias at noc.ntua.gr (Panagiotis Christias) Date: Sun Nov 16 17:13:36 2008 Subject: FreeBSD 7-STABLE, isp(4), QLE2462: panic & deadlocks In-Reply-To: <20081015175453.GA3260@noc.ntua.gr> References: <20081014222343.GA8706@noc.ntua.gr> <1224049455.1277.44.camel@brain.cc.rsu.ru> <20081015175453.GA3260@noc.ntua.gr> Message-ID: <20081117011317.GB52109@noc.ntua.gr> On Wed, Oct 15, 2008 at 08:54:53PM +0300, Panagiotis Christias wrote: > On Wed, Oct 15, 2008 at 09:44:15AM +0400, Oleg Sharoiko wrote: > > Hi! > > > > On Wed, 2008-10-15 at 01:23 +0300, Panagiotis Christias wrote: > > > > > However, when we connect them to the CX3-40, create and mount a new > > > partition and then do something as simple as "tar -C /san -xf ports.tgz" > > > the system panics and deadlocks. 
We have tried several FreeBSD versions > > > (6.3 i386/amd64, 7.0 i386/amd64, 7.1 i386/amd64 and lastly 7-STABLE i386 > > > - we also tried the latest 8-CURRENT snapshot but it panicked too soon). > > > The result is always the same; panic and deadlock. > > > > Try reducing the number of "tagged openings" with 'camcontrol tags' down > > to 46. If it doesn't work try reducing it further to 2. Also be advised > > that I've seen panics with geom_multipath in FreeBSD-7, unfortunately I > > had no time to test it in -current. > > > Hm.. that would probably explain the fact that I was unable to panic the > system when I had set the hint.isp.0.debug="0x1F" in /boot/device.hints. > > Currently I am stress testing the server with the tagged openings set to > 44 (first value tested). Until now there is no panic or deadlock. I am > trying concurrent tar extractions and rsync copies. The filesystem looks > ok till now according to fsck. I will let it write/copy/delete overnight > and tomorrow I will try different tagged opening values. > > Thank you for the hint! I am wondering what is the performance penalty > with decreased tagged openings. Also, is there anything else I could try > in order to get more useful debug output? I have at least three servers > that I could use for any kind of tests and I am willing to spend as much > time as I can get to help solve the problem. > > Finally, the only output in the logs is: > > Expensive timeout(9) function: 0xc06f4210(0xc67e1200) 0.059422635 s > Expensive timeout(9) function: 0xc08d4fd0(0) 0.060676147 s > > I suppose that is related to the CAMDEBUG kernel config options. For the record, I have done many tests using several stress tools in parallel, different FreeBSD versions (up to 7.1-BETA2), various filesystem configurations (plain ufs2 with softupdates, ufs2 and gjournal, zfs) and various tag opening values (down to 2).
Regardless of the configuration, the system deadlocks, panics or the filesystem gets awfully corrupted within seconds, minutes or a few hours. The only configuration that seems to work without problems(?), but with an unacceptable, *severe* performance penalty, is when tag openings are set to the minimum value of 2 (that is more or less the same as disabling tagged command queueing altogether). All tests were run using a 500 GB RAID5 LUN on an EMC Clariion CX3-40: da0 at isp0 bus 0 target 0 lun 0 da0: Fixed Direct Access SCSI-4 device da0: Serial Number CK200083100148 da0: 400.000MB/s transfers da0: Command Queueing Enabled da0: 512000MB (1048576000 512 byte sectors: 255H 63S/T 65270C) Previously, a Sun StorEdge T3 was tested, which worked flawlessly, but it had a 1 Gbps fibre channel interface instead of the 4 Gbps one that the Clariion has, was recognized as a SCSI-3 device and had 2 tag openings (no surprise) by default: da1 at isp1 bus 0 target 0 lun 0 da1: Fixed Direct Access SCSI-3 device da1: 100.000MB/s transfers da1: 241724MB (495050752 512 byte sectors: 255H 63S/T 30815C) As I mentioned before, I am willing to spend time and/or provide access to the system for testing and debugging. Regards, Panagiotis -- Panagiotis J. Christias Network Management Center P.Christias@noc.ntua.gr National Technical Univ. of Athens, GREECE From hartzell at alerce.com Sun Nov 16 18:29:53 2008 From: hartzell at alerce.com (George Hartzell) Date: Sun Nov 16 18:30:00 2008 Subject: problem moving gmirror between two machines. In-Reply-To: <18716.48723.452606.66518@almost.alerce.com> References: <18716.48723.452606.66518@almost.alerce.com> Message-ID: <18720.55070.363778.698000@almost.alerce.com> George Hartzell writes: > > I have an HP DL360 with a pair of 1TB seagate disks that's been > running -STABLE with a ZFS root partition set up using the tools > available here: > > http://yds.coolrat.org/zfsboot.shtml > > It's been working great.
As part of trying to understand what's going > on, I csup'ed to -RELENG earlier today and rebuilt/installed the > kernel and world whilst running on the DL360, so everything should be > current. > > I tried to move the disks into an HP DL320 G4 and it fails to boot > because it can't find /dev/mirror/boot (which it wants to mount onto > /strap and then parts get nullfs'ed onto /boot and /rescue). It gives > me the opportunity to start a shell, and from that shell I can do a > zfs mount -a and get all of the zfs filesystems mounted, but there's > nothing in /dev/mirror. No gmirror status and list are silent. > > I can move the disks back into the older machine and they work fine. > > I've run fdisk -s ad4 and bsdlabel -A /dev/ad4s1a and diffed the > output from the two machines and they're identical. > > I've booted with kern.geom.mirror.debug=2 and the DL320G4 tastes > /dev/ad4s1a (along with everything else) but doesn't do anything with > it. > > Any ideas? > [for the archives] Solved. gmirror had been set up with -h specifying the device, and although the newer server used the same device names for its disks (ad[46]) it assigned them to different hot swap bays. Once I switched the disks everything came up fine. g. From cmdlnkid at gmail.com Sun Nov 16 19:46:48 2008 From: cmdlnkid at gmail.com (CmdLnKid) Date: Sun Nov 16 19:47:19 2008 Subject: problem moving gmirror between two machines. In-Reply-To: <18720.55070.363778.698000@almost.alerce.com> References: <18716.48723.452606.66518@almost.alerce.com> <18720.55070.363778.698000@almost.alerce.com> Message-ID: On Sun, 16 Nov 2008 21:29 -0000, hartzell wrote: > George Hartzell writes: > > > > I have an HP DL360 with a pair of 1TB seagate disks that's been > > running -STABLE with a ZFS root partition set up using the tools > > available here: > > > > http://yds.coolrat.org/zfsboot.shtml > > > > It's been working great. 
As part of trying to understand what's going > > on, I csup'ed to -RELENG earlier today and rebuilt/installed the > > kernel and world whilst running on the DL360, so everything should be > > current. > > > > I tried to move the disks into an HP DL320 G4 and it fails to boot > > because it can't find /dev/mirror/boot (which it wants to mount onto > > /strap and then parts get nullfs'ed onto /boot and /rescue). It gives > > me the opportunity to start a shell, and from that shell I can do a > > zfs mount -a and get all of the zfs filesystems mounted, but there's > > nothing in /dev/mirror. No gmirror status and list are silent. > > > > I can move the disks back into the older machine and they work fine. > > > > I've run fdisk -s ad4 and bsdlabel -A /dev/ad4s1a and diffed the > > output from the two machines and they're identical. > > > > I've booted with kern.geom.mirror.debug=2 and the DL320G4 tastes > > /dev/ad4s1a (along with everything else) but doesn't do anything with > > it. > > > > Any ideas? > > > > [for the archives] > > Solved. gmirror had been set up with -h specifying the device, and > although the newer server used the same device names for its disks > (ad[46]) it assigned them to different hot swap bays. Once I switched > the disks everything came up fine. > > g. Wouldn't it be more feasible in this situation to just glabel the disks and mount them from /dev//