From avg at icyb.net.ua Mon Mar 2 05:36:52 2009
From: avg at icyb.net.ua (Andriy Gapon)
Date: Mon Mar 2 05:36:59 2009
Subject: ieee working group on c++ posix bindings
Message-ID: <49ABE0F1.1040307@icyb.net.ua>

Something that I learned today and that may be interesting to people here:
http://standards.ieee.org/announcements/PR_New_Lan_Standards.html

IEEE has also begun work on IEEE P1003.27, "Standard for Information
Technology - POSIX C++ Language Interfaces - Binding for System
Application Program Interface (API)." This standard will provide a
single, recommended method to allow portable C++ applications to make
use of the POSIX standard interfaces, which will help to discourage the
currently diverging practices in the industry which lead to poor design
choices, inefficiencies, and incompatibilities.

https://www.redhat.com/mailman/listinfo/posix-c++-wg

--
Andriy Gapon

From vascim at yahoo.com Mon Mar 2 06:24:38 2009
From: vascim at yahoo.com (Vasile Marii)
Date: Mon Mar 2 06:33:37 2009
Subject: slow freebsd cripto-accelerating framework
Message-ID: <965289.45194.qm@web38306.mail.mud.yahoo.com>

I'm working to port a crypto-accelerating device driver (it's a
custom-made device) from Linux (where it works fine) to BSD (FreeBSD
7.1), but I couldn't get the same (decent) results as on Linux. The
drivers for Linux and for BSD were both started from the corresponding
driver for the Geode LX crypto accelerator. I concluded that it's not
the device and that the bottleneck is somewhere in the kernel. I
modified the original glxsb (Geode crypto accelerator) driver and made
it return immediately after receiving a crypto task (so the device
actually does nothing, i.e. it takes zero time to encrypt any block of
data), and the data is not actually encrypted. I did this for debugging
purposes, to see if the kernel delivers enough data to the device.
The netperf results between two identical machines (with a tunnel
(AES-CBC with HMAC_SHA256) between them) with exactly the same driver
show a maximum throughput of 20 Mbps (without the IPsec tunnel I can
get 94.1 Mbps). I've seen similar problems in some threads regarding
VIA (which should work with 1.1 Gbps throughput). I've tested the
device without encrypting network traffic (meaning I "feed" the device
manually and give it data immediately after it finishes the previous
block) and I can get a full speed of 117 Mbps (meaning it should be
enough for my needs with a 100 Mbps NIC). Does anybody have better
results on glxsb or VIA (I mean a netperf test between two machines)?
Or is there a hack or a setting in the kernel or somewhere else? Any
help is appreciated. Thanks!

---------------
Vasile Marii

From vanhu at FreeBSD.org Mon Mar 2 07:06:28 2009
From: vanhu at FreeBSD.org (VANHULLEBUS Yvan)
Date: Mon Mar 2 07:06:35 2009
Subject: slow freebsd cripto-accelerating framework
In-Reply-To: <965289.45194.qm@web38306.mail.mud.yahoo.com>
References: <965289.45194.qm@web38306.mail.mud.yahoo.com>
Message-ID: <20090302145952.GA6708@zeninc.net>

Hi.

On Mon, Mar 02, 2009 at 05:57:56AM -0800, Vasile Marii wrote:
[....]
> The netperf results between two identical machines (with a tunnel
> (AES-CBC with HMAC_SHA256) between them) with exactly the same
> driver show a maximum throughput of 20 Mbps (without the IPsec
> tunnel I can get 94.1 Mbps).
> I've seen similar problems in some threads regarding VIA (which
> should work with 1.1 Gbps throughput).

When doing benchmarks on IPsec, the very first thing to do is to
ensure you'll have no fragmentation of ESP packets.

You can do that by updating TCPMSS on the fly (for example with Pf), or
by changing the MTU on TRAFFIC interfaces (and NOT on tunnel
interfaces).

Once you have done that, you can start to look at performance.
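
To make the fragmentation point concrete, here is a rough sketch of the
arithmetic for an IPv4 ESP tunnel using AES-CBC. The fixed sizes are the
usual ones (20-byte outer IPv4 header, 8-byte ESP header, 16-byte IV,
2-byte ESP trailer). The ICV length is an assumption, since it depends
on how the implementation truncates the HMAC, and the function names are
invented for this example.

```c
#include <assert.h>

/* Fixed sizes, in bytes, for IPv4 ESP in tunnel mode with AES-CBC. */
enum {
    OUTER_IP_HDR = 20,  /* outer IPv4 header, no options */
    ESP_HDR      = 8,   /* SPI + sequence number */
    AES_IV       = 16,  /* per-packet IV */
    AES_BLOCK    = 16,  /* ciphertext is padded to a multiple of this */
    ESP_TRAILER  = 2,   /* pad length + next header */
    TCPIP_HDRS   = 40   /* inner IPv4 + TCP headers, no options */
};

/*
 * Largest inner IP packet that still fits in one outer packet of size
 * link_mtu. icv_len is the truncated HMAC length (an assumption).
 */
int
max_inner_packet(int link_mtu, int icv_len)
{
    int room;

    assert(icv_len >= 0);
    room = link_mtu - OUTER_IP_HDR - ESP_HDR - AES_IV - icv_len;
    if (room < AES_BLOCK)
        return (0);
    /* Only whole cipher blocks fit; the trailer lives inside them. */
    return (room / AES_BLOCK * AES_BLOCK - ESP_TRAILER);
}

/* TCP MSS to clamp to so that the resulting ESP packets fit the MTU. */
int
max_tcp_mss(int link_mtu, int icv_len)
{
    int inner = max_inner_packet(link_mtu, icv_len);

    return (inner > TCPIP_HDRS ? inner - TCPIP_HDRS : 0);
}
```

With a 1500-byte MTU this gives 1438 bytes for the inner packet and an
MSS of 1398, for either a 12-byte or a 16-byte ICV.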

And yes, it takes time to do IPsec processing, so your throughput will
be much lower than non-IPsec traffic on the same hosts.

Yvan.

From patfbsd at davenulle.org Mon Mar 2 11:14:36 2009
From: patfbsd at davenulle.org (Patrick Lamaizière)
Date: Mon Mar 2 11:14:43 2009
Subject: slow freebsd cripto-accelerating framework
In-Reply-To: <965289.45194.qm@web38306.mail.mud.yahoo.com>
References: <965289.45194.qm@web38306.mail.mud.yahoo.com>
Message-ID: <20090302201440.1c878fab@baby-jane.lamaiziere.net>

On Mon, 2 Mar 2009 05:57:56 -0800 (PST), Vasile Marii wrote:

> I'm working to port a crypto-accelerating device driver (it's a
> custom-made device) from Linux (where it works fine) to BSD (FreeBSD
> 7.1), but I couldn't get the same (decent) results as on Linux. The
> drivers for Linux and for BSD were both started from the
> corresponding driver for the Geode LX crypto accelerator. I concluded
> that it's not the device and that the bottleneck is somewhere in the
> kernel. I modified the original glxsb (Geode crypto accelerator)
> driver and made it return immediately after receiving a crypto task
> (so the device actually does nothing, i.e. it takes zero time to
> encrypt any block of data), and the data is not actually encrypted.
> I did this for debugging purposes, to see if the kernel delivers
> enough data to the device. The netperf results between two identical
> machines (with a tunnel (AES-CBC with HMAC_SHA256) between them) with
> exactly the same driver show a maximum throughput of 20 Mbps (without
> the IPsec tunnel I can get 94.1 Mbps). I've seen similar problems in
> some threads regarding VIA (which should work with 1.1 Gbps
> throughput). I've tested the device without encrypting network
> traffic (meaning I "feed" the device manually and give it data
> immediately after it finishes the previous block) and I can get a
> full speed of 117 Mbps (meaning it should be enough for my needs with
> a 100 Mbps NIC).
> Does anybody have better results on glxsb or VIA (I mean a netperf
> test between two machines)? Or is there a hack or a setting in the
> kernel or somewhere else? Any help is appreciated. Thanks!

I haven't done any benchmarks with glxsb on IPsec. I've made some
measurements with the cryptotest tool (see
/usr/src/tools/tools/crypto/) and openssl. The throughput is far better
than 20 Mbit/s (around 150 Mbit/s with data size = 4096 bytes). So I
don't think that the bottleneck is in the crypto framework.

But the throughput depends heavily on the size of the data; see (in
blue, the current glxsb):
http://user.lamaiziere.net/patrick/glxsb-171108/glxsb-perf.pdf

Measured with the cryptotest tool.

While I'm here, does someone know why there is a big drop in
throughput when the size is just > 4096 bytes (the size of a page on
i386)?

Regards.

From chuckr at telenix.org Mon Mar 2 12:32:20 2009
From: chuckr at telenix.org (Chuck Robey)
Date: Mon Mar 2 12:32:27 2009
Subject: howto configure FreeBSD's hal?
Message-ID: <49AC3FDE.9040405@telenix.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I can't seem to find anything on how to set up hal on FreeBSD. I hope it's
because I'm being lousy at searching, not that there just isn't anything on the
subject. I think all I want is to set up my Logitech wireless PS/2 (via a USB
to PS/2 converter) mouse, and a PS/2 keyboard. I have a RAID/1 (via a twa
driver), I don't know if that affects my hal or not. I honestly would far
rather do my own configuration, if I could only find anything written up on how
to accomplish this on FreeBSD (current).

Sure would appreciate a pointer to this. I have the idea that anything I could
find would be written for Linux, which wouldn't be terribly correct for
FreeBSD's device setup, am I right?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkmsP94ACgkQz62J6PPcoOlnCQCcCOTg3iBny8uIbWDLIZJvASfI
v+MAnjtujaWT9pUawJVmJFKVMU2M5qKJ
=relA
-----END PGP SIGNATURE-----

From nparhar at gmail.com Mon Mar 2 13:34:39 2009
From: nparhar at gmail.com (Navdeep Parhar)
Date: Mon Mar 2 13:34:46 2009
Subject: puc support for a generic card (patch attached)
Message-ID:

This may interest puc and uart maintainers.

I needed an extra serial port on my FreeBSD machine and bought a
store-branded "1-Port Serial PCI Adapter" from a local computer
store.

This is what pciconf shows:
puc0@pci0:4:1:0: class=0x070002 card=0x00011000 chip=0x98359710 rev=0x01 hdr=0x00

And here's what puc identified it as:
puc0: port 0xec00-0xec07,0xe480-0xe487,0xe400-0xe407,0xe080-0xe087,0xe000-0xe007,0xdc00-0xdc0f irq 16 at device 1.0 on pci4

Visual inspection shows the card has missing circuitry and headers
for the extra serial and parallel port that the chip supports. puc
gave me 2 serial port devices and 1 parallel port device for the card,
and none of them would work (not even the first serial port device).

I had to tweak pucdata.c to get the card working. Patch against
HEAD is attached, and also pasted at the end of this email (in case
this list drops attachments).

Regards,
Navdeep

diff -r 025cb00d19d7 sys/dev/puc/puc.c
--- a/sys/dev/puc/puc.c Sat Feb 28 12:42:37 2009 -0800
+++ b/sys/dev/puc/puc.c Mon Mar 02 12:21:07 2009 -0800
@@ -440,9 +440,6 @@
     sc->sc_dev = dev;
     sc->sc_cfg = cfg;
 
-    /* We don't attach to single-port serial cards. */
-    if (cfg->ports == PUC_PORT_1S || cfg->ports == PUC_PORT_1P)
-        return (EDOOFUS);
     error = puc_config(sc, PUC_CFG_GET_NPORTS, 0, &res);
     if (error)
         return (error);
diff -r 025cb00d19d7 sys/dev/puc/pucdata.c
--- a/sys/dev/puc/pucdata.c Sat Feb 28 12:42:37 2009 -0800
+++ b/sys/dev/puc/pucdata.c Mon Mar 02 12:21:07 2009 -0800
@@ -761,6 +761,12 @@
         PUC_PORT_2P, 0x10, 8, 0,
     },
 
+    { 0x9710, 0x9835, 0x1000, 1,
+        "NetMos NM9835 based 1-port serial",
+        DEFAULT_RCLK,
+        PUC_PORT_1S, 0x10, 4, 0,
+    },
+
     { 0x9710, 0x9835, 0xffff, 0,
         "NetMos NM9835 Dual UART and 1284 Printer port",
         DEFAULT_RCLK,
-------------- next part --------------
A non-text attachment was scrubbed...
Name: puc.patch
Type: application/octet-stream
Size: 896 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090302/ad791835/puc.obj

From vascim at yahoo.com Mon Mar 2 23:13:22 2009
From: vascim at yahoo.com (Vasile Marii)
Date: Mon Mar 2 23:13:29 2009
Subject: no kern.usercrypto
Message-ID: <397057.67515.qm@web38302.mail.mud.yahoo.com>

Hello everybody!
I'm a newbie in BSD.
I don't have /dev/crypto nor kern.usercrypto in sysctl... so where can
I read something about enabling this on my systems?

Thanks

---------------
Vasile Marii

From yanefbsd at gmail.com Tue Mar 3 00:34:46 2009
From: yanefbsd at gmail.com (Garrett Cooper)
Date: Tue Mar 3 00:34:53 2009
Subject: howto configure FreeBSD's hal?
In-Reply-To: <49AC3FDE.9040405@telenix.org>
References: <49AC3FDE.9040405@telenix.org>
Message-ID: <7d6fde3d0903030034p207f2832i7d01fa19349d8129@mail.gmail.com>

On Mon, Mar 2, 2009 at 12:21 PM, Chuck Robey wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I can't seem to find anything on how to set up hal on FreeBSD. I hope it's
> because I'm being lousy at searching, not that there just isn't anything on the
> subject. I think all I want is to set up my Logitech wireless PS/2 (via a USB
> to PS/2 converter) mouse, and a PS/2 keyboard.
> I have a RAID/1 (via a twa
> driver), I don't know if that affects my hal or not. I honestly would far
> rather do my own configuration, if I could only find anything written up on how
> to accomplish this on FreeBSD (current).
>
> Sure would appreciate a pointer to this. I have the idea that anything I could
> find would be written for Linux, which wouldn't be terribly correct for
> FreeBSD's device setup, am I right?

Hi Chuck,
What does the output look like when you try to boot with `boot -v',
and how recent are your kernel and world? What revision of the source
tree are they based on (RELENG_7, HEAD?), and is your version of hal
up-to-date?
Thanks,
-Garrett

From darcsis at gmail.com Tue Mar 3 05:29:08 2009
From: darcsis at gmail.com (Denise H. G.)
Date: Tue Mar 3 05:29:15 2009
Subject: howto configure FreeBSD's hal?
In-Reply-To: <49AC3FDE.9040405@telenix.org> (Chuck Robey's message of "Mon, 02 Mar 2009 15:21:50 -0500")
References: <49AC3FDE.9040405@telenix.org>
Message-ID: <86k576hik4.fsf@pluton.xbsd.name>

>>>>> "Chuck" == Chuck Robey writes:

Chuck> I can't seem to find anything on how to set up hal on
Chuck> FreeBSD. I hope it's because I'm being lousy at searching,
Chuck> not that there just isn't anything on the subject. I think
Chuck> all I want is to set up my Logitech wireless PS/2 (via a USB
Chuck> to PS/2 converter) mouse, and a PS/2 keyboard. I have a
Chuck> RAID/1 (via a twa driver), I don't know if that affects my
Chuck> hal or not. I honestly would far rather do my own
Chuck> configuration, if I could only find anything written up on
Chuck> how to accomplish this on FreeBSD (current).

Chuck> Sure would appreciate a pointer to this. I have the idea that
Chuck> anything I could find would be written for Linux, which
Chuck> wouldn't be terribly correct for FreeBSD's device setup, am I
Chuck> right?

HAL configuration is complex if you don't know much about XML. First,
I assume you are heading in the right direction; then just go ahead
with the XML stuff and the HAL configuration. You can find some HAL
material on the Red Hat websites, and searching Google might be a good
idea as well.

Good luck.

--
darcsis ZAI gmail DIAN com

From jhb at freebsd.org Tue Mar 3 06:32:19 2009
From: jhb at freebsd.org (John Baldwin)
Date: Tue Mar 3 06:32:41 2009
Subject: puc support for a generic card (patch attached)
In-Reply-To:
References:
Message-ID: <200903030915.43037.jhb@freebsd.org>

On Monday 02 March 2009 4:05:02 pm Navdeep Parhar wrote:
> This may interest puc and uart maintainers.
>
> I needed an extra serial port on my FreeBSD machine and bought a
> store-branded "1-Port Serial PCI Adapter" from a local computer
> store.
>
> This is what pciconf shows:
> puc0@pci0:4:1:0: class=0x070002 card=0x00011000 chip=0x98359710
> rev=0x01 hdr=0x00
>
> And here's what puc identified it as:
> puc0: port
> 0xec00-0xec07,0xe480-0xe487,0xe400-0xe407,0xe080-0xe087,0xe000-0xe007,0xdc00-0xdc0f
> irq 16 at device 1.0 on pci4
>
> Visual inspection shows the card has missing circuitry and headers
> for the extra serial and parallel port that the chip supports. puc
> gave me 2 serial port devices and 1 parallel port device for the
> card, and none of them would work (not even the first serial port
> device).
>
> I had to tweak pucdata.c to get the card working. Patch against
> HEAD is attached, and also pasted at the end of this email (in case
> this list drops attachments).
>
> Regards,
> Navdeep
>
> diff -r 025cb00d19d7 sys/dev/puc/puc.c
> --- a/sys/dev/puc/puc.c Sat Feb 28 12:42:37 2009 -0800
> +++ b/sys/dev/puc/puc.c Mon Mar 02 12:21:07 2009 -0800
> @@ -440,9 +440,6 @@
>      sc->sc_dev = dev;
>      sc->sc_cfg = cfg;
>
> -    /* We don't attach to single-port serial cards. */
> -    if (cfg->ports == PUC_PORT_1S || cfg->ports == PUC_PORT_1P)
> -        return (EDOOFUS);

FWIW, the traditional reason for this is that we made the sio/uart or
ppc drivers claim single-port devices directly and only use puc for
multiple-port cards. I'm not sure if that should still be the case or
not. Marcel, do you have an opinion?

>      error = puc_config(sc, PUC_CFG_GET_NPORTS, 0, &res);
>      if (error)
>          return (error);
> diff -r 025cb00d19d7 sys/dev/puc/pucdata.c
> --- a/sys/dev/puc/pucdata.c Sat Feb 28 12:42:37 2009 -0800
> +++ b/sys/dev/puc/pucdata.c Mon Mar 02 12:21:07 2009 -0800
> @@ -761,6 +761,12 @@
>          PUC_PORT_2P, 0x10, 8, 0,
>      },
>
> +    { 0x9710, 0x9835, 0x1000, 1,
> +        "NetMos NM9835 based 1-port serial",
> +        DEFAULT_RCLK,
> +        PUC_PORT_1S, 0x10, 4, 0,
> +    },
> +
>      { 0x9710, 0x9835, 0xffff, 0,
>          "NetMos NM9835 Dual UART and 1284 Printer port",
>          DEFAULT_RCLK,
>

--
John Baldwin

From xcllnt at mac.com Tue Mar 3 08:49:07 2009
From: xcllnt at mac.com (Marcel Moolenaar)
Date: Tue Mar 3 08:49:16 2009
Subject: puc support for a generic card (patch attached)
In-Reply-To: <200903030915.43037.jhb@freebsd.org>
References: <200903030915.43037.jhb@freebsd.org>
Message-ID:

On Mar 3, 2009, at 6:15 AM, John Baldwin wrote:

>> diff -r 025cb00d19d7 sys/dev/puc/puc.c
>> --- a/sys/dev/puc/puc.c Sat Feb 28 12:42:37 2009 -0800
>> +++ b/sys/dev/puc/puc.c Mon Mar 02 12:21:07 2009 -0800
>> @@ -440,9 +440,6 @@
>>      sc->sc_dev = dev;
>>      sc->sc_cfg = cfg;
>>
>> -    /* We don't attach to single-port serial cards. */
>> -    if (cfg->ports == PUC_PORT_1S || cfg->ports == PUC_PORT_1P)
>> -        return (EDOOFUS);
>
> FWIW, the traditional reason for this is that we made the sio/uart
> or ppc drivers claim single-port devices directly and only use puc
> for multiple-port cards. I'm not sure if that should still be the
> case or not. Marcel, do you have an opinion?

Yes :-)

I explicitly added the test with that particular error code
to make it absolutely clear that puc(4) is not the driver
for single port cards. The reason being that it's pointless.

There are 2 things that puc(4) facilitates in: resource
assignment and interrupt handling. For single port cards
there's nothing to distribute nor is there any interrupt
sharing. In other words: there's no value that puc(4) adds.
As such, uart(4) and ppc(4) can attach directly to those
cards and puc(4) does not have to be involved.

BTW: Traditionally puc(4) was used to attach even to single
port cards. With the puc(4) rewrite I changed that, because
it was really a mixed bag. Some single-port cards were known
to puc(4) others to uart(4)/sio(4) or ppc(4). That typically
leads to confusion given that puc(4) is (still) not in GENERIC.
(i.e. why is this UART attached, but that one isn't, they're
both single port?)

So, please do not apply the patch and instead add the IDs to
sys/dev/uart/uart_bus_pci.c...
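
For readers unfamiliar with how such ID tables are consulted, the
following is a minimal sketch of a first-match lookup keyed on the
values pciconf prints. It is not the puc(4) or uart(4) code: the
structure, the table and the function name are hypothetical, and only
the ID values and the 0xffff wildcard are taken from the patch earlier
in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_ANY 0xffff   /* wildcard subvendor, as in the catch-all entry */

struct id_entry {
    uint16_t vendor, device, subvendor, subdevice;
    const char *desc;
};

/* Hypothetical table holding the two NetMos entries from the patch. */
static const struct id_entry id_table[] = {
    { 0x9710, 0x9835, 0x1000, 0x0001,
      "NetMos NM9835 based 1-port serial" },
    { 0x9710, 0x9835, ID_ANY, 0x0000,
      "NetMos NM9835 Dual UART and 1284 Printer port" },
};

/*
 * chip and card are the 32-bit values printed by pciconf: device (or
 * subdevice) in the high half, vendor (or subvendor) in the low half.
 * Returns the index of the first matching entry, or -1 if none match.
 */
int
id_lookup(uint32_t chip, uint32_t card)
{
    uint16_t vendor = (uint16_t)(chip & 0xffff);
    uint16_t device = (uint16_t)(chip >> 16);
    uint16_t subvendor = (uint16_t)(card & 0xffff);
    uint16_t subdevice = (uint16_t)(card >> 16);
    size_t i;

    for (i = 0; i < sizeof(id_table) / sizeof(id_table[0]); i++) {
        const struct id_entry *e = &id_table[i];

        if (e->vendor != vendor || e->device != device)
            continue;
        if (e->subvendor == ID_ANY ||
            (e->subvendor == subvendor && e->subdevice == subdevice))
            return ((int)i);
    }
    return (-1);
}
```

Because the first match wins, the specific 1-port entry has to sit
before the wildcard entry, which is where the patch places it. With
chip=0x98359710 and card=0x00011000 the lookup returns index 0; any
other card value on the same chip falls through to the catch-all.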

FYI,

--
Marcel Moolenaar
xcllnt@mac.com

From jhb at freebsd.org Tue Mar 3 09:00:31 2009
From: jhb at freebsd.org (John Baldwin)
Date: Tue Mar 3 09:00:38 2009
Subject: puc support for a generic card (patch attached)
In-Reply-To:
References: <200903030915.43037.jhb@freebsd.org>
Message-ID: <200903031159.55299.jhb@freebsd.org>

On Tuesday 03 March 2009 11:48:42 am Marcel Moolenaar wrote:
>
> On Mar 3, 2009, at 6:15 AM, John Baldwin wrote:
>
> >> diff -r 025cb00d19d7 sys/dev/puc/puc.c
> >> --- a/sys/dev/puc/puc.c Sat Feb 28 12:42:37 2009 -0800
> >> +++ b/sys/dev/puc/puc.c Mon Mar 02 12:21:07 2009 -0800
> >> @@ -440,9 +440,6 @@
> >>      sc->sc_dev = dev;
> >>      sc->sc_cfg = cfg;
> >>
> >> -    /* We don't attach to single-port serial cards. */
> >> -    if (cfg->ports == PUC_PORT_1S || cfg->ports == PUC_PORT_1P)
> >> -        return (EDOOFUS);
> >
> > FWIW, the traditional reason for this is that we made the sio/uart
> > or ppc
> > drivers claim single port devices directly and only use puc for
> > multiple-port
> > cards. I'm not sure if that should still be the case or not.
> > Marcel, do you
> > have an opinion?
>
> Yes :-)
>
> I explicitly added the test with that particular error code
> to make it absolutely clear that puc(4) is not the driver
> for single port cards. The reason being that it's pointless.
>
> There are 2 things that puc(4) facilitates in: resource
> assignment and interrupt handling. For single port cards
> there's nothing to distribute nor is there any interrupt
> sharing. In other words: there's no value that puc(4) adds.
> As such, uart(4) and ppc(4) can attach directly to those
> cards and puc(4) does not have to be involved.
>
> BTW: Traditionally puc(4) was used to attach even to single
> port cards. With the puc(4) rewrite I changed that, because
> it was really a mixed bag. Some single-port cards were known
> to puc(4) others to uart(4)/sio(4) or ppc(4). That typically
> leads to confusion given that puc(4) is (still) not in GENERIC.
> (i.e.
> why is this UART attached, but that one isn't, they're
> both single port?)
>
> So, please do not apply the patch and instead add the IDs to
> sys/dev/uart/uart_bus_pci.c...

This sounds fine to me. :) Navdeep, can you develop a patch for uart(4)
instead and test that?

--
John Baldwin

From xcllnt at mac.com Tue Mar 3 09:04:42 2009
From: xcllnt at mac.com (Marcel Moolenaar)
Date: Tue Mar 3 09:04:53 2009
Subject: puc support for a generic card (patch attached)
In-Reply-To: <200903031159.55299.jhb@freebsd.org>
References: <200903030915.43037.jhb@freebsd.org>
 <200903031159.55299.jhb@freebsd.org>
Message-ID: <9B775F97-1E5A-4E55-A2AE-26DC78CD08C0@mac.com>

On Mar 3, 2009, at 8:59 AM, John Baldwin wrote:

> On Tuesday 03 March 2009 11:48:42 am Marcel Moolenaar wrote:
>>
>> On Mar 3, 2009, at 6:15 AM, John Baldwin wrote:
>>
>>>> diff -r 025cb00d19d7 sys/dev/puc/puc.c
>>>> --- a/sys/dev/puc/puc.c Sat Feb 28 12:42:37 2009 -0800
>>>> +++ b/sys/dev/puc/puc.c Mon Mar 02 12:21:07 2009 -0800
>>>> @@ -440,9 +440,6 @@
>>>>      sc->sc_dev = dev;
>>>>      sc->sc_cfg = cfg;
>>>>
>>>> -    /* We don't attach to single-port serial cards. */
>>>> -    if (cfg->ports == PUC_PORT_1S || cfg->ports == PUC_PORT_1P)
>>>> -        return (EDOOFUS);
>>>
>>> FWIW, the traditional reason for this is that we made the sio/uart
>>> or ppc
>>> drivers claim single port devices directly and only use puc for
>>> multiple-port
>>> cards. I'm not sure if that should still be the case or not.
>>> Marcel, do you
>>> have an opinion?
>>
>> Yes :-)
>>
>> I explicitly added the test with that particular error code
>> to make it absolutely clear that puc(4) is not the driver
>> for single port cards. The reason being that it's pointless.
>>
>> There are 2 things that puc(4) facilitates in: resource
>> assignment and interrupt handling. For single port cards
>> there's nothing to distribute nor is there any interrupt
>> sharing. In other words: there's no value that puc(4) adds.
>> As such, uart(4) and ppc(4) can attach directly to those
>> cards and puc(4) does not have to be involved.
>>
>> BTW: Traditionally puc(4) was used to attach even to single
>> port cards. With the puc(4) rewrite I changed that, because
>> it was really a mixed bag. Some single-port cards were known
>> to puc(4) others to uart(4)/sio(4) or ppc(4). That typically
>> leads to confusion given that puc(4) is (still) not in GENERIC.
>> (i.e. why is this UART attached, but that one isn't, they're
>> both single port?)
>>
>> So, please do not apply the patch and instead add the IDs to
>> sys/dev/uart/uart_bus_pci.c...
>
> This sounds fine to me. :) Navdeep, can you develop a patch for
> uart(4)
> instead and test that?

BTW: I forgot to mention that puc(4) needs to back-off from this
particular card. That means that the catch-all that we have there
needs to be tweaked.

So, the change to pucdata.c can still be made, but with a big
comment that states that the entry is added only to avoid puc(4)
from attaching to that particular 1-port card so that uart(4)
can claim it...

--
Marcel Moolenaar
xcllnt@mac.com

From sam at freebsd.org Tue Mar 3 09:37:57 2009
From: sam at freebsd.org (Sam Leffler)
Date: Tue Mar 3 09:38:03 2009
Subject: no kern.usercrypto
In-Reply-To: <397057.67515.qm@web38302.mail.mud.yahoo.com>
References: <397057.67515.qm@web38302.mail.mud.yahoo.com>
Message-ID: <49AD6AF3.9060505@freebsd.org>

Vasile Marii wrote:
> Hello everybody!
> I'm a newbie in BSD.
> I don't have /dev/crypto nor kern.usercrypto in sysctl...so where can i read something about enabling this on my systems.
>
man 4 crypto

From avg at icyb.net.ua Tue Mar 3 10:15:42 2009
From: avg at icyb.net.ua (Andriy Gapon)
Date: Tue Mar 3 10:15:50 2009
Subject: ln: posixly confused
Message-ID: <49AD73C8.7010500@icyb.net.ua>

Test case.
Preparation:
$ mkdir linktest
$ cd linktest
$ mkdir some_dir
$ mkdir other_dir
The test:
$ ln -s some_dir the_link
$ ln -s -f other_dir the_link

Expected: the_link points to other_dir.
Actual result: some_dir contains symlink other_dir -> other_dir.

From ln(1):
SYNOPSIS
     ln [-s [-F]] [-f | -iw] [-hnv] source_file [target_file]
     ln [-s [-F]] [-f | -iw] [-hnv] source_file ... target_dir

I thought that only a true directory would trigger the second form.
I thought that the second argument being a symlink (to a file or to a
directory) should trigger the first form.

I also read this:
http://www.opengroup.org/onlinepubs/009695399/utilities/ln.html

I think that the text there (and in ln(1)) implies what I expected, but
this is not spelled out clearly.
I am confused.

--
Andriy Gapon

From neldredge at math.ucsd.edu Tue Mar 3 10:32:03 2009
From: neldredge at math.ucsd.edu (Nate Eldredge)
Date: Tue Mar 3 10:32:10 2009
Subject: ln: posixly confused
In-Reply-To: <49AD73C8.7010500@icyb.net.ua>
References: <49AD73C8.7010500@icyb.net.ua>
Message-ID:

On Tue, 3 Mar 2009, Andriy Gapon wrote:

>
> Test case.
> Preparation:
> $ mkdir linktest
> $ cd linktest
> $ mkdir some_dir
> $ mkdir other_dir
> The test:
> $ ln -s some_dir the_link
> $ ln -s -f other_dir the_link
>
> Expected: the_link points to other_dir.
> Actual result: some_dir contains symlink other_dir -> other_dir.
>
> From ln(1):
> SYNOPSIS
>     ln [-s [-F]] [-f | -iw] [-hnv] source_file [target_file]
>     ln [-s [-F]] [-f | -iw] [-hnv] source_file ... target_dir
>
> I thought that only a true directory would trigger the second form.
> I thought that the second argument being a symlink (to a file or to a
> directory) should trigger the first form.
>
> I also read this:
> http://www.opengroup.org/onlinepubs/009695399/utilities/ln.html
>
> I think that the text there (and in ln(1)) implies what I expected,
> but this is not spelled out clearly.

FWIW, Linux and Solaris have the same behavior as FreeBSD.
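
The observed result is what one would expect if the utility decides
between the two forms by following symbolic links on its last operand.
The sketch below only demonstrates that difference between stat(2) and
lstat(2) on the the_link from the test case; it is not the ln(1)
source, and the helper names and the scratch directory under /tmp are
made up for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Nonzero if path names a directory once symbolic links are followed. */
int
names_directory(const char *path)
{
    struct stat sb;

    return (stat(path, &sb) == 0 && S_ISDIR(sb.st_mode));
}

/* Nonzero if path itself is a symbolic link. */
int
is_symlink(const char *path)
{
    struct stat sb;

    return (lstat(path, &sb) == 0 && S_ISLNK(sb.st_mode));
}

/*
 * Rebuild some_dir and the_link in a scratch directory and report how
 * the_link looks to stat(2) and to lstat(2). Returns 0 on success.
 */
int
check_the_link(int *as_directory, int *as_symlink)
{
    char base[] = "/tmp/linktest.XXXXXX";
    char dir[64], link[64];
    int ok;

    if (mkdtemp(base) == NULL)
        return (-1);
    snprintf(dir, sizeof(dir), "%s/some_dir", base);
    snprintf(link, sizeof(link), "%s/the_link", base);

    /* The link target is relative, as in "ln -s some_dir the_link". */
    ok = mkdir(dir, 0700) == 0 && symlink("some_dir", link) == 0;
    if (ok) {
        *as_directory = names_directory(link);
        *as_symlink = is_symlink(link);
    }

    /* Clean up; unlink() removes the link itself, not its target. */
    unlink(link);
    rmdir(dir);
    rmdir(base);
    return (ok ? 0 : -1);
}
```

Both checks come back true for the_link: it is a symbolic link, and it
also names a directory once the link is followed. The -h and -n options
listed in the synopsis are documented in ln(1) as not following a
symbolic link given as the target, so `ln -s -f -h other_dir the_link`
is the usual way to get the expected result.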

The standard says the second form is triggered if the second argument
"names an existing directory". An informative note in the symlink()
specification at
http://www.opengroup.org/onlinepubs/009695399/functions/symlink.html
says "a symbolic link allows a file to have multiple logical names".
Therefore, I think it's a fair interpretation to say that a symbolic
link to an existing directory "names" it.

--
Nate Eldredge
neldredge@math.ucsd.edu

From avg at icyb.net.ua Tue Mar 3 10:45:53 2009
From: avg at icyb.net.ua (Andriy Gapon)
Date: Tue Mar 3 10:46:00 2009
Subject: ln: posixly confused
In-Reply-To:
References: <49AD73C8.7010500@icyb.net.ua>
Message-ID: <49AD7A8F.2030802@icyb.net.ua>

on 03/03/2009 20:32 Nate Eldredge said the following:
> On Tue, 3 Mar 2009, Andriy Gapon wrote:
>
>>
>> Test case.
>> Preparation:
>> $ mkdir linktest
>> $ cd linktest
>> $ mkdir some_dir
>> $ mkdir other_dir
>> The test:
>> $ ln -s some_dir the_link
>> $ ln -s -f other_dir the_link
>>
>> Expected: the_link points to other_dir.
>> Actual result: some_dir contains symlink other_dir -> other_dir.
>>
>> From ln(1):
>> SYNOPSIS
>>     ln [-s [-F]] [-f | -iw] [-hnv] source_file [target_file]
>>     ln [-s [-F]] [-f | -iw] [-hnv] source_file ... target_dir
>>
>> I thought that only true directory would trigger the second form.
>> I thought that the second argument being a symlink (to a file or to a
>> directory)
>> should trigger the first form.
>>
>> I also read this:
>> http://www.opengroup.org/onlinepubs/009695399/utilities/ln.html
>>
>> I think that the text there (and in ln(1)) implies what I expected,
>> but this is
>> not spelled out clearly.
>
> FWIW, Linux and Solaris have the same behavior as FreeBSD.
>
> The standard says the second form is triggered if the second argument
> "names an existing directory".
> An informative note in the symlink()
> specification at
> http://www.opengroup.org/onlinepubs/009695399/functions/symlink.html
> says "a symbolic link allows a file to have multiple logical names".
> Therefore, I think it's a fair interpretation to say that a symbolic
> link to an existing directory "names" it.

Thank you for the info!

--
Andriy Gapon

From nparhar at gmail.com Tue Mar 3 11:51:49 2009
From: nparhar at gmail.com (Navdeep Parhar)
Date: Tue Mar 3 11:51:56 2009
Subject: puc support for a generic card (patch attached)
In-Reply-To: <9B775F97-1E5A-4E55-A2AE-26DC78CD08C0@mac.com>
References: <200903030915.43037.jhb@freebsd.org>
 <200903031159.55299.jhb@freebsd.org>
 <9B775F97-1E5A-4E55-A2AE-26DC78CD08C0@mac.com>
Message-ID: <20090303195122.GA30421@insightsol.com>

On Tue, Mar 03, 2009 at 09:04:32AM -0800, Marcel Moolenaar wrote:
>
> On Mar 3, 2009, at 8:59 AM, John Baldwin wrote:
>
>> On Tuesday 03 March 2009 11:48:42 am Marcel Moolenaar wrote:
>>>
>>> On Mar 3, 2009, at 6:15 AM, John Baldwin wrote:
>>>
>>>>> diff -r 025cb00d19d7 sys/dev/puc/puc.c
>>>>> --- a/sys/dev/puc/puc.c Sat Feb 28 12:42:37 2009 -0800
>>>>> +++ b/sys/dev/puc/puc.c Mon Mar 02 12:21:07 2009 -0800
>>>>> @@ -440,9 +440,6 @@
>>>>>      sc->sc_dev = dev;
>>>>>      sc->sc_cfg = cfg;
>>>>>
>>>>> -    /* We don't attach to single-port serial cards. */
>>>>> -    if (cfg->ports == PUC_PORT_1S || cfg->ports == PUC_PORT_1P)
>>>>> -        return (EDOOFUS);
>>>>
>>>> FWIW, the traditional reason for this is that we made the sio/uart
>>>> or ppc
>>>> drivers claim single port devices directly and only use puc for
>>>> multiple-port
>>>> cards. I'm not sure if that should still be the case or not.
>>>> Marcel, do you
>>>> have an opinion?
>>>
>>> Yes :-)
>>>
>>> I explicitly added the test with that particular error code
>>> to make it absolutely clear that puc(4) is not the driver
>>> for single port cards. The reason being that it's pointless.
>>>
>>> There are 2 things that puc(4) facilitates in: resource
>>> assignment and interrupt handling. For single port cards
>>> there's nothing to distribute nor is there any interrupt
>>> sharing. In other words: there's no value that puc(4) adds.
>>> As such, uart(4) and ppc(4) can attach directly to those
>>> cards and puc(4) does not have to be involved.
>>>
>>> BTW: Traditionally puc(4) was used to attach even to single
>>> port cards. With the puc(4) rewrite I changed that, because
>>> it was really a mixed bag. Some single-port cards were known
>>> to puc(4) others to uart(4)/sio(4) or ppc(4). That typically
>>> leads to confusion given that puc(4) is (still) not in GENERIC.
>>> (i.e. why is this UART attached, but that one isn't, they're
>>> both single port?)
>>>
>>> So, please do not apply the patch and instead add the IDs to
>>> sys/dev/uart/uart_bus_pci.c...
>>
>> This sounds fine to me. :) Navdeep, can you develop a patch for
>> uart(4)
>> instead and test that?
>
> BTW: I forgot to mention that puc(4) needs to back-off from this
> particular card. That means that the catch-all that we have there
> needs to be tweaked.
>
> So, the change to pucdata.c can still be made, but with a big
> comment that states that the entry is added only to avoid puc(4)
> from attaching to that particular 1-port card so that uart(4)
> can claim it...

OK, I'll keep this in mind and will modify the patch to have uart(4)
claim the card and puc(4) ignore it. I'll post it once I've tested
it.

Regards,
Navdeep

>
> --
> Marcel Moolenaar
> xcllnt@mac.com
>
>
>

From nparhar at gmail.com Tue Mar 3 14:48:27 2009
From: nparhar at gmail.com (Navdeep Parhar)
Date: Tue Mar 3 14:48:34 2009
Subject: puc support for a generic card (patch attached)
In-Reply-To: <20090303195122.GA30421@insightsol.com>
References: <200903030915.43037.jhb@freebsd.org>
 <200903031159.55299.jhb@freebsd.org>
 <9B775F97-1E5A-4E55-A2AE-26DC78CD08C0@mac.com>
 <20090303195122.GA30421@insightsol.com>
Message-ID:

On Tue, Mar 3, 2009 at 11:51 AM, Navdeep Parhar wrote:
> On Tue, Mar 03, 2009 at 09:04:32AM -0800, Marcel Moolenaar wrote:
>>
>> On Mar 3, 2009, at 8:59 AM, John Baldwin wrote:
>>
>>> On Tuesday 03 March 2009 11:48:42 am Marcel Moolenaar wrote:
>>>>
>>>> On Mar 3, 2009, at 6:15 AM, John Baldwin wrote:
>>>>
>>>>>> diff -r 025cb00d19d7 sys/dev/puc/puc.c
>>>>>> --- a/sys/dev/puc/puc.c Sat Feb 28 12:42:37 2009 -0800
>>>>>> +++ b/sys/dev/puc/puc.c Mon Mar 02 12:21:07 2009 -0800
>>>>>> @@ -440,9 +440,6 @@
>>>>>>      sc->sc_dev = dev;
>>>>>>      sc->sc_cfg = cfg;
>>>>>>
>>>>>> -    /* We don't attach to single-port serial cards. */
>>>>>> -    if (cfg->ports == PUC_PORT_1S || cfg->ports == PUC_PORT_1P)
>>>>>> -        return (EDOOFUS);
>>>>>
>>>>> FWIW, the traditional reason for this is that we made the sio/uart
>>>>> or ppc
>>>>> drivers claim single port devices directly and only use puc for
>>>>> multiple-port
>>>>> cards. I'm not sure if that should still be the case or not.
>>>>> Marcel, do you
>>>>> have an opinion?
>>>>
>>>> Yes :-)
>>>>
>>>> I explicitly added the test with that particular error code
>>>> to make it absolutely clear that puc(4) is not the driver
>>>> for single port cards. The reason being that it's pointless.
>>>>
>>>> There are 2 things that puc(4) facilitates in: resource
>>>> assignment and interrupt handling. For single port cards
>>>> there's nothing to distribute nor is there any interrupt
>>>> sharing. In other words: there's no value that puc(4) adds.
>>>> As such, uart(4) and ppc(4) can attach directly to those >>>> cards and puc(4) does not have to be involved. >>>> >>>> BTW: Traditionally puc(4) was used to attach even to single >>>> port cards. With the puc(4) rewrite I changed that, because >>>> it was really a mixed bag. Some single-port cards were known >>>> to puc(4) others to uart(4)/sio(4) or ppc(4). That typically >>>> leads to confusion given that puc(4) is (still) not in GENERIC. >>>> (i.e. why is this UART attached, but that one isn't, they're >>>> both single port?) >>>> >>>> So, please do not apply the patch and instead add the IDs to >>>> sys/dev/uart/uart_bus_pci.c... >>> >>> This sounds fine to me. :) Navdeep, can you develop a patch for >>> uart(4) >>> instead and test that? >> >> BTW: I forgot to mention that puc(4) needs to back-off from this >> particular card. That means that the catch-all that we have there >> needs to be tweaked. >> >> So, the change to pucdata.c can still be made, but with a big >> comment that states that the entry is added only to avoid puc(4) >> from attaching to that particular 1-port card so that uart(4) >> can claim it... > > OK, I'll keep this in mind and will modify the patch to have uart(4) > claim the card and puc(4) ignore it. I'll post it once I've tested > it. Reworked patch attached. Works for me. Navdeep > > Regards, > Navdeep > >> >> -- >> Marcel Moolenaar >> xcllnt@mac.com >> >> >> > -------------- next part -------------- A non-text attachment was scrubbed...
Name: puc.2.patch Type: application/octet-stream Size: 1227 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090303/3875017b/puc.2.obj From matthew.fleming at isilon.com Tue Mar 3 14:49:44 2009 From: matthew.fleming at isilon.com (Matthew Fleming) Date: Tue Mar 3 14:50:22 2009 Subject: knotes Message-ID: <06D5F9F6F655AD4C92E28B662F7F853E0275F4E1@seaxch09.desktop.isilon.com> I am trying to understand the knote system (on 6.1) and I am having some troubles. Specifically, I am confused by the uses of KN_DETACHED and EV_ONESHOT. >From what I can determine from the comments and code, knotes have a filterops member, kn_fop. This among other things has a callback to handle when a note is attached and detached. But e.g. in knlist_clear(), when knlist_remove_kq() removes a knote from the list, it sets KN_DETACHED but does not call the kn_fop->f_detach routine. Then, in the killkn case, KN_DETACHED is set (again). Otherwise, EV_ONESHOT is set, presumably so that kqueue_scan() will run on the knote. However, kqueue_scan() won't call kn_fop->f_detach either because KN_DETACHED is already set. It seems that in knlist_cleardel(), the killkn case should be calling kn_fop->f_detach before knote_drop(). It also seems that the !killkn case should not have KN_DETACHED set, which means that knlist_remove_kq() can't set it. Alternatively, knlist_remove_kq() should be calling kn_fop->f_detach itself before setting KN_DETACHED. But in that case I'm not sure I see why there needs to be a use of EV_ONESHOT. So am I reading this wrong, understanding it wrong, or is there a bug in the code? 
Thanks, matthew From kostikbel at gmail.com Wed Mar 4 04:35:39 2009 From: kostikbel at gmail.com (Kostik Belousov) Date: Wed Mar 4 04:35:46 2009 Subject: knotes In-Reply-To: <06D5F9F6F655AD4C92E28B662F7F853E0275F4E1@seaxch09.desktop.isilon.com> References: <06D5F9F6F655AD4C92E28B662F7F853E0275F4E1@seaxch09.desktop.isilon.com> Message-ID: <20090304123524.GQ41617@deviant.kiev.zoral.com.ua> On Tue, Mar 03, 2009 at 02:49:45PM -0800, Matthew Fleming wrote: > I am trying to understand the knote system (on 6.1) and I am having some > troubles. > > Specifically, I am confused by the uses of KN_DETACHED and EV_ONESHOT. > >From what I can determine from the comments and code, knotes have a > filterops member, kn_fop. This among other things has a callback to > handle when a note is attached and detached. > > But e.g. in knlist_clear(), when knlist_remove_kq() removes a knote from > the list, it sets KN_DETACHED but does not call the kn_fop->f_detach > routine. Then, in the killkn case, KN_DETACHED is set (again). > Otherwise, EV_ONESHOT is set, presumably so that kqueue_scan() will run > on the knote. However, kqueue_scan() won't call kn_fop->f_detach either > because KN_DETACHED is already set. > > It seems that in knlist_cleardel(), the killkn case should be calling > kn_fop->f_detach before knote_drop(). It also seems that the !killkn > case should not have KN_DETACHED set, which means that > knlist_remove_kq() can't set it. Alternatively, knlist_remove_kq() > should be calling kn_fop->f_detach itself before setting KN_DETACHED. > But in that case I'm not sure I see why there needs to be a use of > EV_ONESHOT. > > So am I reading this wrong, understanding it wrong, or is there a bug in > the code? There are two pathes to each knote, one from the kqueue(2) syscall, and another is from the kernel subsystem. My understanding is that f_detach handler is intended to be called from the syscall path only. 
For instance, pipe destructor sys_pipe.c:pipeclose() calls knlist_clear, i.e. knlist_cleardel with killkn == 0. The detach handler for the pipe removes the knote from the corresponding pipe' knlist. It seems that it is simply wrong to call pipe f_detach from knlist_clear(). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090304/2ea3480e/attachment.pgp From bsd.quest at googlemail.com Wed Mar 4 08:29:40 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Wed Mar 4 08:29:47 2009 Subject: uma_zone Message-ID: <671bb5fc0903040829m7c7ab79ay612868bb4260bd21@mail.gmail.com> how can I get the size and pointer of some allocated uma zone ? For example: zone_pack Thanx Alexej From octavian.covalschi at gmail.com Wed Mar 4 09:42:54 2009 From: octavian.covalschi at gmail.com (Octavian Covalschi) Date: Wed Mar 4 10:18:15 2009 Subject: Spin down HDD after disk sync or before power off Message-ID: Hi everyone. I'm looking a way to spin down HDD just right before power off. Why? Because currently when I call "shutdown -p now", HDD is powered off at it's full speed (7200.4) and as a result I hear a noise of stopping/spinning down of HDD, and _this_ concerns me as I'm afraid it can damage HDD. So basically I want to spin down hdd/put into sleep mode before system is powered off. I've tried to use rc.shutdown, but the sync of disks "wakes" HDD again... While searching for a solution, I noticed that reboot command/app _does_ spin down hdd right before it resets system power, I can hear how HDD is powered on after that... but halt (which is used by shutdown) doesn't do that... 2nd thing is I cannot find "halt.c" file, i wanted to take a look how it does it... 
although I'm up to date it's not not in /usr/src/sbin Thank you in advance From olli at lurza.secnetix.de Wed Mar 4 11:38:55 2009 From: olli at lurza.secnetix.de (Oliver Fromme) Date: Wed Mar 4 11:39:02 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: Message-ID: <200903041938.n24Jcqdr060153@lurza.secnetix.de> Octavian Covalschi wrote: > I'm looking a way to spin down HDD just right before power off. Why? > > Because currently when I call "shutdown -p now", HDD is powered off at it's > full speed (7200.4) and as a result > I hear a noise of stopping/spinning down of HDD, and _this_ concerns me as > I'm afraid it can damage HDD. You don't have to spin down a disk before powering it off. The noise you hear is probably caused by the "autopark" feature of the drive. It is harmless. > I've tried to use rc.shutdown, but the sync of disks "wakes" HDD again... Of course, upon halt or reboot the kernel will sync all file systems that have been mounted read+write. > While searching for a solution, I noticed that reboot command/app _does_ > spin down hdd right before it resets system power, > I can hear how HDD is powered on after that... No, the reboot command doesn't do that. It's probably your BIOS that resets the devices. > 2nd thing is I cannot find "halt.c" file, i wanted to take a look how it > does it... although I'm up to date it's not not in > /usr/src/sbin halt(8) is a hardlink to reboot(8). Look at src/sbin/reboot. By the way, the syncing does not happen in halt(8). At the time the kernel syncs the disks, no processes are running anymore, not even init(8). You can't do anything from userland at this point. If you want to insert a spin-down for your disks, you will have to modify the kernel. You have to install an event handler for "shutdown_post_sync". See the boot() function in src/sys/kern/kern_shutdown.c for details about the kernel's shutdown sequence. Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. 
KG, Marktplatz 29, 85567 Grafing b. M. Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung: secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd "Python is an experiment in how much freedom programmers need. Too much freedom and nobody can read another's code; too little and expressiveness is endangered." -- Guido van Rossum From joerg at britannica.bec.de Wed Mar 4 12:15:51 2009 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Wed Mar 4 12:15:59 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <200903041938.n24Jcqdr060153@lurza.secnetix.de> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> Message-ID: <20090304195614.GA179@britannica.bec.de> On Wed, Mar 04, 2009 at 08:38:52PM +0100, Oliver Fromme wrote: > Octavian Covalschi wrote: > > I'm looking a way to spin down HDD just right before power off. Why? > > > > Because currently when I call "shutdown -p now", HDD is powered off at it's > > full speed (7200.4) and as a result > > I hear a noise of stopping/spinning down of HDD, and _this_ concerns me as > > I'm afraid it can damage HDD. > > You don't have to spin down a disk before powering it off. > The noise you hear is probably caused by the "autopark" > feature of the drive. It is harmless. This is not true. Many hard disks don't like having to do an emergency shutdown, as it shortens the disk's lifetime. That's what happens if you power off the machine while the disks are still spinning.
Joerg From octavian.covalschi at gmail.com Wed Mar 4 13:17:54 2009 From: octavian.covalschi at gmail.com (Octavian Covalschi) Date: Wed Mar 4 13:18:01 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <200903041938.n24Jcqdr060153@lurza.secnetix.de> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> Message-ID: Thank you for detailed and clear answer! On Wed, Mar 4, 2009 at 1:38 PM, Oliver Fromme wrote: > Octavian Covalschi wrote: > > I'm looking a way to spin down HDD just right before power off. Why? > > > > Because currently when I call "shutdown -p now", HDD is powered off at > it's > > full speed (7200.4) and as a result > > I hear a noise of stopping/spinning down of HDD, and _this_ concerns me > as > > I'm afraid it can damage HDD. > > You don't have to spin down a disk before powering it off. > The noise you hear is probably caused by the "autopark" > feature of the drive. It is harmless. > > > I've tried to use rc.shutdown, but the sync of disks "wakes" HDD > again... > > Of course, upon halt or reboot the kernel will sync all > file systems that have been mounted read+write. > > > While searching for a solution, I noticed that reboot command/app _does_ > > spin down hdd right before it resets system power, > > I can hear how HDD is powered on after that... > > No, the reboot command doesn't do that. It's probably your > BIOS that resets the devices. > > > 2nd thing is I cannot find "halt.c" file, i wanted to take a look how it > > does it... although I'm up to date it's not not in > > /usr/src/sbin > > halt(8) is a hardlink to reboot(8). > Look at src/sbin/reboot. > > By the way, the syncing does not happen in halt(8). At the > time the kernel syncs the disks, no processes are running > anymore, not even init(8). You can't do anything from > userland at this point. If you want to insert a spin-down > for your disks, you will have to modify the kernel. You > have to install an event handler for "shutdown_post_sync". 
> See the boot() function in src/sys/kern/kern_shutdown.c > for details about the kernel's shutdown sequence. > > Best regards > Oliver > > -- > Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M. > Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung: > secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün- > chen, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart > > FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd > > "Python is an experiment in how much freedom programmers need. > Too much freedom and nobody can read another's code; too little > and expressiveness is endangered." > -- Guido van Rossum > From dthiele at gmx.net Wed Mar 4 16:52:35 2009 From: dthiele at gmx.net (Daniel Thiele) Date: Wed Mar 4 16:52:42 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <200903041938.n24Jcqdr060153@lurza.secnetix.de> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> Message-ID: <49AF1C1B.3050604@gmx.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Oliver Fromme wrote: | Octavian Covalschi wrote: | > I'm looking a way to spin down HDD just right before power off. Why? | > | > Because currently when I call "shutdown -p now", HDD is powered off at it's | > full speed (7200.4) and as a result | > I hear a noise of stopping/spinning down of HDD, and _this_ concerns me as | > I'm afraid it can damage HDD. | [...] | You can't do anything from | userland at this point. If you want to insert a spin-down | for your disks, you will have to modify the kernel. That is what I did and am still doing successfully since 2006. See http://lists.freebsd.org/pipermail/freebsd-acpi/2006-January/002375.html for my initial problem description and http://lists.freebsd.org/pipermail/freebsd-acpi/2006-February/002566.html for the "solution". Note that back then David Tolpin (dvd@davidashen.net) suggested to use " ...
& (ATA_SUPPORT_APM|ATA_SUPPORT_STANDBY)" instead. I don't know if that is the way it should be done, but for me it worked across 3 hard disks and two notebooks so far. I am aware that 3 disks and 2 notebooks provide very limited test results, but maybe this work around solves your problem, too. It would still be great, though, if a proper solution for this could be permanently implemented into FreeBSD. That is, if the current behaviour really is not that healthy to hard drives, as Joerg suggested. Best regards, Daniel -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkmvHBUACgkQCOZKcWNoXg6LvQCgkT9GGMqa6M/t3hhN9cBM8Fee laQAoNPRvQkk4HkvQYjtVPRsxNZr3Lmn =InHj -----END PGP SIGNATURE----- From olli at lurza.secnetix.de Wed Mar 4 23:58:53 2009 From: olli at lurza.secnetix.de (Oliver Fromme) Date: Wed Mar 4 23:58:59 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090304195614.GA179@britannica.bec.de> Message-ID: <200903050758.n257wod8088426@lurza.secnetix.de> Joerg Sonnenberger wrote: > Oliver Fromme wrote: > > Octavian Covalschi wrote: > > > I'm looking a way to spin down HDD just right before power off. Why? > > > > > > Because currently when I call "shutdown -p now", HDD is powered off at it's > > > full speed (7200.4) and as a result > > > I hear a noise of stopping/spinning down of HDD, and _this_ concerns me as > > > I'm afraid it can damage HDD. > > > > You don't have to spin down a disk before powering it off. > > The noise you hear is probably caused by the "autopark" > > feature of the drive. It is harmless. > > This is not true. Many hard disks don't like having to do an emergency > shutdown as it affects the disk life time negatively. That's what > happens if you poweroff the machine when the disks are still spinning. Can you point to any authoritative information (URL) about that claim, such as vendor specs, white paper or similar? 
Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M. Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung: secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd I suggested holding a "Python Object Oriented Programming Seminar", but the acronym was unpopular. -- Joseph Strout From mav at FreeBSD.org Thu Mar 5 01:55:32 2009 From: mav at FreeBSD.org (Alexander Motin) Date: Thu Mar 5 01:55:39 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <49AF1C1B.3050604@gmx.net> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <49AF1C1B.3050604@gmx.net> Message-ID: <49AF9381.50709@FreeBSD.org> Daniel Thiele wrote: > Oliver Fromme wrote: > | Octavian Covalschi wrote: > | > I'm looking a way to spin down HDD just right before power off. Why? > | > > | > Because currently when I call "shutdown -p now", HDD is powered off > at it's > | > full speed (7200.4) and as a result > | > I hear a noise of stopping/spinning down of HDD, and _this_ > concerns me as > | > I'm afraid it can damage HDD. I am not sure that there is any problem. For the last 10 years, drives have used electromagnetic head positioning, which mechanically parks the heads on power down. > | [...] > | You can't do anything from > | userland at this point. If you want to insert a spin-down > | for your disks, you will have to modify the kernel. > > That is what I did and am still doing successfully since 2006. > See > http://lists.freebsd.org/pipermail/freebsd-acpi/2006-January/002375.html > for my initial problem description and > http://lists.freebsd.org/pipermail/freebsd-acpi/2006-February/002566.html > for the "solution". Note that back then David Tolpin > (dvd@davidashen.net) suggested to use > " ... & (ATA_SUPPORT_APM|ATA_SUPPORT_STANDBY)" > instead.
> > I don't know if that is the way it should be done, but for me it worked > across 3 hard disks and two notebooks so far. I am aware that 3 disks > and 2 notebooks provide very limited test results, but maybe this work > around solves your problem, too. > > It would still be great, though, if a proper solution for this could be > permanently implemented into FreeBSD. That is, if the current behaviour > really is not that healthy to hard drives, as Joerg suggested. I have thought about doing that on device detach to prepare drive to mechanical shocks in case of drive physical removing. But to work properly it requires some changes in ATA core to be made first to protect against submitting commands to already physically removed drive.. I can agree with doing that on suspend if ACPI does not doing it automatically. But on system shutdown having meaning of reboot, I think, commanding drive IDLE will just lead to additional mechanical and power stress for drive and PSU when drives will be spin-up in just a few seconds after spin-down. -- Alexander Motin From joerg at britannica.bec.de Thu Mar 5 06:25:21 2009 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Thu Mar 5 06:25:28 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <200903050758.n257wod8088426@lurza.secnetix.de> References: <20090304195614.GA179@britannica.bec.de> <200903050758.n257wod8088426@lurza.secnetix.de> Message-ID: <20090305142517.GA2773@britannica.bec.de> On Thu, Mar 05, 2009 at 08:58:50AM +0100, Oliver Fromme wrote: > > This is not true. Many hard disks don't like having to do an emergency > > shutdown as it affects the disk life time negatively. That's what > > happens if you poweroff the machine when the disks are still spinning. > > Can you point to any authoritative information (URL) about > that claim, such as vendor specs, white paper or similar? Not without digging. NetBSD PR 21531 had a reference, but that is dead nowadays. 
Joerg From octavian.covalschi at gmail.com Thu Mar 5 06:43:33 2009 From: octavian.covalschi at gmail.com (Octavian Covalschi) Date: Thu Mar 5 06:43:40 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090305142517.GA2773@britannica.bec.de> References: <20090304195614.GA179@britannica.bec.de> <200903050758.n257wod8088426@lurza.secnetix.de> <20090305142517.GA2773@britannica.bec.de> Message-ID: I think just HDD parking won't harm HDD, as my prev HDD was autoparking constantly (safe or powersave reasons). However in my case, I can hear a sound that is made by a vibration, it's like plates are slowing down and something is vibrating (arm?). It's kind the same with desktop PC, when your power is cut and your HDD are stopped too sudden. Thanks. On Thu, Mar 5, 2009 at 8:25 AM, Joerg Sonnenberger wrote: > On Thu, Mar 05, 2009 at 08:58:50AM +0100, Oliver Fromme wrote: > > > This is not true. Many hard disks don't like having to do an emergency > > > shutdown as it affects the disk life time negatively. That's what > > > happens if you poweroff the machine when the disks are still spinning. > > > > Can you point to any authoritative information (URL) about > > that claim, such as vendor specs, white paper or similar? > > Not without digging. NetBSD PR 21531 had a reference, but that is dead > nowadays. 
> > Joerg > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From abitos at abitos.org Thu Mar 5 11:18:11 2009 From: abitos at abitos.org (Tobias Blersch) Date: Thu Mar 5 11:18:18 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <200903050758.n257wod8088426@lurza.secnetix.de> References: <200903050758.n257wod8088426@lurza.secnetix.de> Message-ID: <49B02211.1010809@abitos.org> Oliver Fromme wrote: > > Joerg Sonnenberger wrote: > > This is not true. Many hard disks don't like having to do an emergency > > shutdown as it affects the disk life time negatively. That's what > > happens if you poweroff the machine when the disks are still spinning. > > Can you point to any authoritative information (URL) about > that claim, such as vendor specs, white paper or similar? http://www.hitachigst.com/tech/techlib.nsf/techdocs/28DCCB17E0EEC5A086256F4E006E2F5B Thats the specification for my notebooks hard drive. Section 6.6 Reliability gives data about how to power-off the disk. It also contains numbers of supported load/unloads and emergency unloads. Emergency unloads are invoked when the heads are still loaded and power fails. > The product supports a minimum of 600,000 normal load/unloads. > The drive supports a minimum of 20,000 emergency unloads. Tobias From neldredge at math.ucsd.edu Thu Mar 5 11:49:11 2009 From: neldredge at math.ucsd.edu (Nate Eldredge) Date: Thu Mar 5 11:49:18 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <49B02211.1010809@abitos.org> References: <200903050758.n257wod8088426@lurza.secnetix.de> <49B02211.1010809@abitos.org> Message-ID: On Thu, 5 Mar 2009, Tobias Blersch wrote: > Oliver Fromme wrote: >> > Joerg Sonnenberger wrote: >> > This is not true. 
Many hard disks don't like having to do an emergency >> > shutdown as it affects the disk life time negatively. That's what >> > happens if you poweroff the machine when the disks are still spinning. >> >> Can you point to any authoritative information (URL) about >> that claim, such as vendor specs, white paper or similar? > > http://www.hitachigst.com/tech/techlib.nsf/techdocs/28DCCB17E0EEC5A086256F4E006E2F5B > > Thats the specification for my notebooks hard drive. Section 6.6 > Reliability gives data about how to power-off the disk. It also contains > numbers of supported load/unloads and emergency unloads. Emergency > unloads are invoked when the heads are still loaded and power fails. Ok, I didn't know that. There are some drives that can unload the heads normally on power loss and don't need any special handling, and I was under the mistaken impression that this was universal. But the documentation suggests that this should be a BIOS function. When the kernel tries to poweroff the system, isn't that normally done via the BIOS (perhaps with ACPI/APM)? So maybe the BIOS is supposed to unload the heads (by sending a standby/sleep command) before cutting the power. This makes sense in some ways. Suppose the drive is attached to a weird ATA controller that FreeBSD doesn't know anything about. (Maybe it's used by the other system in a dual-boot setup.) There's no way that FreeBSD could send it a power-down sequence, but the BIOS could. Perhaps the OP's BIOS for some reason doesn't do this correctly. 
-- Nate Eldredge neldredge@math.ucsd.edu From joerg at britannica.bec.de Thu Mar 5 12:07:11 2009 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Thu Mar 5 12:07:18 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: References: <200903050758.n257wod8088426@lurza.secnetix.de> <49B02211.1010809@abitos.org> Message-ID: <20090305200706.GA1790@britannica.bec.de> On Thu, Mar 05, 2009 at 11:49:10AM -0800, Nate Eldredge wrote: > This makes sense in some ways. Suppose the drive is attached to a weird > ATA controller that FreeBSD doesn't know anything about. (Maybe it's > used by the other system in a dual-boot setup.) There's no way that > FreeBSD could send it a power-down sequence, but the BIOS could. As long as you can send a ATA command directly to the disk, you can spin it down. Joerg From dthiele at gmx.net Thu Mar 5 12:37:21 2009 From: dthiele at gmx.net (Daniel Thiele) Date: Thu Mar 5 12:37:29 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: References: <200903050758.n257wod8088426@lurza.secnetix.de> <49B02211.1010809@abitos.org> Message-ID: <49B037F6.3080001@gmx.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Nate Eldredge wrote: | On Thu, 5 Mar 2009, Tobias Blersch wrote: | |> Oliver Fromme wrote: |>> > Joerg Sonnenberger wrote: |>> > This is not true. Many hard disks don't like having to do an emergency |>> > shutdown as it affects the disk life time negatively. That's what |>> > happens if you poweroff the machine when the disks are still spinning. |>> |>> Can you point to any authoritative information (URL) about |>> that claim, such as vendor specs, white paper or similar? |> |> http://www.hitachigst.com/tech/techlib.nsf/techdocs/28DCCB17E0EEC5A086256F4E006E2F5B |> |> |> Thats the specification for my notebooks hard drive. Section 6.6 |> Reliability gives data about how to power-off the disk. It also contains |> numbers of supported load/unloads and emergency unloads. 
Emergency |> unloads are invoked when the heads are still loaded and power fails. | | Ok, I didn't know that. There are some drives that can unload the heads | normally on power loss and don't need any special handling, and I was | under the mistaken impression that this was universal. | | But the documentation suggests that this should be a BIOS function. | When the kernel tries to poweroff the system, isn't that normally done | via the BIOS (perhaps with ACPI/APM)? So maybe the BIOS is supposed to | unload the heads (by sending a standby/sleep command) before cutting the | power. | Interestingly, the specification for the Hitachi drive in my notebook (a TravelStar 5K320) "Travelstar 5K320 Specification - HTSxxx models v1.0" available at http://www.hitachigst.com/tech/techlib.nsf/products/Travelstar_5K320 says in the paragraph "Required power-off sequence": "The required host system sequence for removing power from the drive is as follows [...]" whereas the TravelStar 5K100 specification lists exactly the same steps but states that it is the BIOS' job to take care of executing them. | Perhaps the OP's BIOS for some reason doesn't do this correctly. I tried this on 2 different notebooks (2 ThinkPads though) and on both machines the disks make a very audible "click" sound when I "shutdown -p now" FreeBSD (6.x - CURRENT). With the patch I mentioned in my other reply, however, the disks seem to power off more smoothly. On a Samsung X20 notebook I observed comparable behavior. So I am not sure if it is just one badly implemented function in a small number of BIOSes or something that the operating system is supposed to take care of, or should at least try to. Just for comparison: The original Windows XP on the Samsung X20 powers off the disk in a "smooth" way, too. Unfortunately, I haven't had the time to test this with other operating systems.
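[Editorial sketch] The kernel-side approach discussed in this thread (a handler for "shutdown_post_sync", run after the file systems are synced) can be modeled in plain userspace C to show the ordering and the power-off-only check. This is a minimal sketch under stated assumptions, not the patch referred to in the thread: every name here (MODEL_POWEROFF, register_handler, model_shutdown, disk_spindown, struct disk) is hypothetical, the pre-sync and final phases are included only to show ordering, and a real kernel hook would be registered through the eventhandler(9) interface and would issue an ATA command instead of setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag for this model; the kernel's own flags live in
 * <sys/reboot.h> and are not reproduced here. */
#define MODEL_POWEROFF 0x1

/* The three phases of the modeled shutdown sequence. */
enum phase { PRE_SYNC, POST_SYNC, FINAL, NPHASES };

typedef void (*shutdown_fn)(void *arg, int howto);

struct handler {
    shutdown_fn fn;
    void *arg;
};

#define MAX_HANDLERS 8
static struct handler handlers[NPHASES][MAX_HANDLERS];
static size_t nhandlers[NPHASES];

/* Hypothetical stand-in for registering an event handler. */
static void register_handler(enum phase p, shutdown_fn fn, void *arg)
{
    assert(nhandlers[p] < MAX_HANDLERS);
    handlers[p][nhandlers[p]].fn = fn;
    handlers[p][nhandlers[p]].arg = arg;
    nhandlers[p]++;
}

/* Call every handler registered for one phase, in registration order. */
static void invoke(enum phase p, int howto)
{
    for (size_t i = 0; i < nhandlers[p]; i++)
        handlers[p][i].fn(handlers[p][i].arg, howto);
}

/* Modeled sequence: pre-sync hooks, sync, post-sync hooks, final hooks. */
static void model_shutdown(int howto, int *synced)
{
    invoke(PRE_SYNC, howto);
    *synced = 1;                 /* stands in for syncing file systems */
    invoke(POST_SYNC, howto);
    invoke(FINAL, howto);
}

struct disk {
    const int *synced;           /* observed state of the modeled sync */
    int spun_down;
    int synced_at_spindown;
};

/* Post-sync hook: spin down only when the machine is powering off. */
static void disk_spindown(void *arg, int howto)
{
    struct disk *d = arg;

    if ((howto & MODEL_POWEROFF) == 0)
        return;                  /* reboot: leave the disk spinning */
    d->synced_at_spindown = *d->synced;
    d->spun_down = 1;            /* a real hook would send the ATA command */
}
```

Registering disk_spindown for POST_SYNC and calling model_shutdown(MODEL_POWEROFF, &synced) marks the disk as spun down only after the modeled sync has completed, while model_shutdown(0, &synced) (a reboot in this model) leaves it spinning, which is the distinction raised earlier in the thread.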
Best regards, Daniel -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkmwN/IACgkQCOZKcWNoXg6/fgCdFZkpy9Muz7BBw7VPqBVOcfr8 nPIAoLZ+S3aT19nW0jNhk9r41f/IC/rL =7wPG -----END PGP SIGNATURE----- From dthiele at gmx.net Thu Mar 5 13:22:01 2009 From: dthiele at gmx.net (Daniel Thiele) Date: Thu Mar 5 13:22:08 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <49AF9381.50709@FreeBSD.org> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <49AF1C1B.3050604@gmx.net> <49AF9381.50709@FreeBSD.org> Message-ID: <49B04281.2030406@gmx.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Alexander Motin wrote: | Daniel Thiele wrote: |> Oliver Fromme wrote: |> | Octavian Covalschi wrote: |> | > I'm looking a way to spin down HDD just right before power off. Why? |> | > |> | > Because currently when I call "shutdown -p now", HDD is powered off |> at it's |> | > full speed (7200.4) and as a result |> | > I hear a noise of stopping/spinning down of HDD, and _this_ |> concerns me as |> | > I'm afraid it can damage HDD. | | I am not sure that there is any problem. Last 10 years drives using | electromagnetic head positioning which mechanically parts heads on power | down. | |> | [...] |> | You can't do anything from |> | userland at this point. If you want to insert a spin-down |> | for your disks, you will have to modify the kernel. |> |> That is what I did and am still doing successfully since 2006. |> See |> http://lists.freebsd.org/pipermail/freebsd-acpi/2006-January/002375.html |> for my initial problem description and |> http://lists.freebsd.org/pipermail/freebsd-acpi/2006-February/002566.html |> for the "solution". Note that back then David Tolpin |> (dvd@davidashen.net) suggested to use |> " ... & (ATA_SUPPORT_APM|ATA_SUPPORT_STANDBY)" |> instead. 
|> |> I don't know if that is the way it should be done, but for me it worked |> across 3 hard disks and two notebooks so far. I am aware that 3 disks |> and 2 notebooks provide very limited test results, but maybe this work |> around solves your problem, too. |> |> It would still be great, though, if a proper solution for this could be |> permanently implemented into FreeBSD. That is, if the current behaviour |> really is not that healthy to hard drives, as Joerg suggested. | | I have thought about doing that on device detach to prepare drive to | mechanical shocks in case of drive physical removing. But to work | properly it requires some changes in ATA core to be made first to | protect against submitting commands to already physically removed drive.. | | I can agree with doing that on suspend if ACPI does not doing it | automatically. | | But on system shutdown having meaning of reboot, I think, commanding | drive IDLE will just lead to additional mechanical and power stress for | drive and PSU when drives will be spin-up in just a few seconds after | spin-down. | On reboot I do not observe the drives "click" noise, but the drive, too, gets powered off (without any patches). So I think that the ad-hoc fix I mentioned above does not introduce any additional stress to the drive on reboot or am I missing something? Looking at the numbers in the Hitachi drive specifications Tobias an I dug out from the Hitachi website (see replies in the Joerg Sonnenberger branch of this thread) the normal Load/Unload count is about 30 times higher than the Emergency Unload count. So even if an ATA_STANDBY_IMMEDIATE command may introduce additional Load/Unload stress on reboot it is not as bad as the stress causes by an Emergency Unload on shutdown. Of course this only applies if the "click" sound is really caused by an Emergency Unload. Is there a way to figure out? Maybe the S.M.A.R.T. feature records the two kinds of power-offs. 
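[Editorial sketch] The "about 30 times" figure follows from the two minimums quoted earlier in the thread from the Hitachi specification: 600,000 normal load/unloads versus 20,000 emergency unloads. A minimal C sketch of that arithmetic follows; the helper names and the power-offs-per-day parameter are hypothetical, and treating the rated minimum as a linear budget is an assumption made only for illustration.

```c
#include <assert.h>

/* Minimum cycle counts quoted in the thread from the Hitachi spec. */
#define NORMAL_LOAD_UNLOADS  600000L
#define EMERGENCY_UNLOADS     20000L

/* Ratio of the two rated minimums (hypothetical helper). */
static long unload_ratio(void)
{
    return NORMAL_LOAD_UNLOADS / EMERGENCY_UNLOADS;
}

/* Days until a rated minimum is used up at a given number of
 * power-offs per day (illustrative only; real wear is not linear). */
static long days_until_exhausted(long budget, long poweroffs_per_day)
{
    assert(poweroffs_per_day > 0);
    return budget / poweroffs_per_day;
}
```

Under these assumptions, unload_ratio() evaluates to 30, and days_until_exhausted(EMERGENCY_UNLOADS, 1) to 20000 days, a bit under 55 years at one power-off per day. The specification gives these counts as minimums, so they are a floor rather than a prediction of failure.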
Additionally, the Hitachi TravleStar 5K320 specification states that it is the host systems job to execute the drives proper power-off sequence. The TravelStar 5K100 specification, on the other hand, mentiones that the BIOS is responsible for that. Best regards, Daniel -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkmwQn8ACgkQCOZKcWNoXg41fACfYt9eYJkL6mYdKFXeiyo4pnZf OfoAnimxxjoKFxzuWx3/NHOvecRxjkhx =Cl7N -----END PGP SIGNATURE----- From ed at 80386.nl Thu Mar 5 13:28:09 2009 From: ed at 80386.nl (Ed Schouten) Date: Thu Mar 5 13:28:16 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <49B04281.2030406@gmx.net> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <49AF1C1B.3050604@gmx.net> <49AF9381.50709@FreeBSD.org> <49B04281.2030406@gmx.net> Message-ID: <20090305212807.GC19161@hoeg.nl> * Daniel Thiele wrote: > Looking at the numbers in the Hitachi drive specifications Tobias an I > dug out from the Hitachi website (see replies in the Joerg Sonnenberger > branch of this thread) the normal Load/Unload count is about 30 times > higher than the Emergency Unload count. Have you also looked at the definition of `emergency unload'? Maybe this number doesn't actually refer to the number of unloads caused by power loss, but because they detect a very high amount of vibration. But I'm not a hard disk expert. -- Ed Schouten WWW: http://80386.nl/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090305/8f5fcc9c/attachment.pgp From dthiele at gmx.net Thu Mar 5 13:45:40 2009 From: dthiele at gmx.net (Daniel Thiele) Date: Thu Mar 5 13:45:56 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090305212807.GC19161@hoeg.nl> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <49AF1C1B.3050604@gmx.net> <49AF9381.50709@FreeBSD.org> <49B04281.2030406@gmx.net> <20090305212807.GC19161@hoeg.nl> Message-ID: <49B0480A.3090909@gmx.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Ed Schouten wrote: | * Daniel Thiele wrote: |> Looking at the numbers in the Hitachi drive specifications Tobias an I |> dug out from the Hitachi website (see replies in the Joerg Sonnenberger |> branch of this thread) the normal Load/Unload count is about 30 times |> higher than the Emergency Unload count. | | Have you also looked at the definition of `emergency unload'? Maybe this | number doesn't actually refer to the number of unloads caused by power | loss, but because they detect a very high amount of vibration. But I'm | not a hard disk expert. | I am no disk expert either. The Hitachi TravelStar 5K320 specification says on this topic: 6.3.6.1 Emergency unload When hard disk drive power is interrupted while the heads are still loaded the micro code cannot operate and the normal 5 -volt power is unavailable to unload the heads. In this case, normal unload is not possible. The heads are unloaded by routing the back EMF of the spinning motor to the voice coil. The actuator velocity is greater than the normal case and the unload process is inherently less controllable without a normal seek current profile. Emergency unload is intended to be invoked in rare situations. Because this operation is inherently uncontrolled, it is more mechanically stressful than a normal unload. 
So it seems to be a kind of self-protection mechanism that
electro-mechanically tries to get the heads into a safe position without
firmware or microcode intervention in the case of a sudden power outage.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.11 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkmwSAIACgkQCOZKcWNoXg7vlQCgzcSvK25cLBfemmsC7/xXmtcl
/7kAmwQGM5xFVjZJW7YGqNWaWIXuXqcu
=cPcz
-----END PGP SIGNATURE-----

From psteele at maxiscale.com Thu Mar 5 22:54:07 2009
From: psteele at maxiscale.com (Peter Steele)
Date: Thu Mar 5 22:54:14 2009
Subject: How to tear down a geom mirror?
In-Reply-To: <17738942.121236320716364.JavaMail.HALO$@halo>
Message-ID: <17349951.141236320867093.JavaMail.HALO$@halo>

I posed this question on the questions list but didn't get any traction.
Hopefully someone here will have an answer.

I've created a USB boot disk that is used to clone itself onto the
system's hard drives, setting up mirrored file systems in the process.
The main difficulty I'm having is reimaging a system with an existing OS
whose drives are already configured in a mirror. I want of course to
destroy the mirror and create a completely new one, but I can't find the
right process to accomplish this reliably. I don't want to make any
assumptions about what mirrors might exist already, and I definitely
don't want to do "gmirror load" before I get a chance to destroy any
existing mirrors.

What I am doing is cleaning the drive using dd. For example, assume my
target system has two drives, ad1 and ad2. I issue the following commands:

dd if=/dev/zero of=/dev/ad1 bs=512 count=79
dd if=/dev/zero of=/dev/ad2 bs=512 count=79

I'm assuming this is enough to destroy any existing mirrors on the target
drives, and I do this before the geom driver is loaded.
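
To make concrete which part of a disk these two dd commands cover, here is
a small sketch. The helper names and the disk size (156301488 sectors of
512 bytes, roughly 80 GB) are made-up values for illustration only and do
not describe the actual target systems:

```sh
#!/bin/sh
# Illustrative helpers (not from the original post): report which part of a
# device a "dd if=/dev/zero bs=BS count=COUNT" run overwrites.

# zeroed_bytes BS COUNT -> number of bytes written from the start of the device
zeroed_bytes() {
    echo $(( $1 * $2 ))
}

# untouched_sectors TOTAL_SECTORS COUNT -> sectors that keep their old contents
untouched_sectors() {
    echo $(( $1 - $2 ))
}

# Hypothetical disk size in 512-byte sectors (about 80 GB).
TOTAL=156301488

echo "zeroed:    sectors 0-78 ($(zeroed_bytes 512 79) bytes)"
echo "untouched: sectors 79-$(( TOTAL - 1 )) ($(untouched_sectors $TOTAL 79) sectors)"
```

With count=79 only the first 40448 bytes are overwritten; everything from
sector 79 up to the end of the disk keeps whatever it held before.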
After this, I partition the drives as I want them, and then create the
mirrored pair:

gmirror load
gmirror label -v -n -b round-robin gm0 ad1s1
gmirror insert gm0 ad2s1

This process works exactly as I want if the system that is being
reimaged has no existing mirrors. However, if the drives were previously
participating in a mirror, the label command fails, reporting the
following error:

gmirror: Can't store metadata on ad1s1: Operation not permitted.

If I make sure the existing mirrors are torn down first by doing a
"remove" operation instead of using the dd method, this can solve the
problem, but in some cases the mirror on the target system is in a
suspect state and I've seen the "gmirror load" command hang indefinitely.
So I don't want to do a load command before I destroy the old mirrors,
but I can't seem to find a way to reliably destroy the old mirrors. Can
anyone suggest a way to do this?

From guru at unixarea.de Fri Mar 6 02:32:39 2009
From: guru at unixarea.de (Matthias Apitz)
Date: Fri Mar 6 02:32:47 2009
Subject: Fwd: associated to AP (WEP mode) && no IP addr via DHCP
Message-ID: <20090306064633.GA2603@rebelion.Sisis.de>

Hello,

I've posted the problem below to freebsd-mobile with zero (visible)
effect; maybe someone from freebsd-hackers has at least an idea for me
where to look for further debugging; it should not stay the case that a
simple, stupid Nokia works in a Wifi zone while FreeBSD does not :-)

Thx for your time reading my problem

	matthias

----- Forwarded message from Matthias Apitz -----

From: Matthias Apitz
Date: Thu, 5 Mar 2009 13:00:22 +0100
To: freebsd-mobile@freebsd.org
Subject: associated to AP (WEP mode) && no IP addr via DHCP

Hello,

I frequently go to a Greek restaurant in my town to have dinner or some
red wine and to read; the owner of the restaurant has a Wifi zone and
gave me, as his best customer, the WEP key to connect to the Internet;
the problem is that he does not have the admin password of the AP
(someone else configured it)
and so I can't have a look into the config of the AP; my
/etc/wpa_supplicant.conf for the AP is:

# Restaurante Odyssey (2007-11-18)
#
network={
        ssid="ConnectionPoint"
        scan_ssid=0
        key_mgmt=NONE
        wep_tx_keyidx=0
        wep_key0=xxxxxxxxxxxxxxxxxxxxxx
}

and the interface associates fine:

# ifconfig ath0
ath0: flags=8843 metric 0 mtu 1500
        ether 00:15:af:b2:ae:e6
        inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
        media: IEEE 802.11 Wireless Ethernet autoselect (OFDM/36Mbps)
        status: associated
        ssid ConnectionPoint channel 11 (2462 Mhz 11g) bssid 00:01:e3:0e:97:99
        authmode OPEN privacy ON deftxkey 1 wepkey 1:104-bit txpower 31.5
        bmiss 7 scanvalid 60 protmode CTS burst roaming MANUAL

but a DHCP request does not give me any IP addr; with

# tcpdump -n -i ath0

it says:

19:01:01.603869 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:15:af:b2:ae:e6, length 300
19:01:02.036549 00:01:e3:0e:97:98 Unknown SSAP 0x2c > ff:ff:ff:ff:ff:ff Unknown DSAP 0xa2 Information, send seq 98, rcv seq 39, Flags [Command], length 36
19:01:02.958057 00:01:e3:0e:97:98 ProWay NM > ff:ff:ff:ff:ff:ff Unknown DSAP 0x5c Supervisory, Reject, rcv seq 8, Flags [Response], length 36
19:01:04.186892 00:01:e3:0e:97:98 Unknown SSAP 0xbe > ff:ff:ff:ff:ff:ff Unknown DSAP 0x44 Supervisory, Reject, rcv seq 39, Flags [Final], length 36
19:01:09.606218 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:15:af:b2:ae:e6, length 300

this situation has persisted for months, and I gave up and always use
UMTS if I want to connect to the Internet; until yesterday I was thinking
of some kind of MAC addr filter in the AP, but .... yesterday I was there
with a friend who has a Nokia E51 mobile device; I gave him the key, he
associated like me and within a second got IP and DNS, and all was fine;

what kind of problem is that? it is not ath0 related, because my other
laptop with iwi0 does not get an IP either; what information can I
provide to nail this down?
Thx matthias -- Matthias Apitz Manager Technical Support - OCLC GmbH Gruenwalder Weg 28g - 82041 Oberhaching - Germany t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211 e - w http://www.oclc.org/ http://www.UnixArea.de/ _______________________________________________ freebsd-mobile@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-mobile To unsubscribe, send any mail to "freebsd-mobile-unsubscribe@freebsd.org" ----- End forwarded message ----- -- Matthias Apitz Manager Technical Support - OCLC GmbH Gruenwalder Weg 28g - 82041 Oberhaching - Germany t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211 e - w http://www.oclc.org/ http://www.UnixArea.de/ From ticso at cicely7.cicely.de Fri Mar 6 03:03:08 2009 From: ticso at cicely7.cicely.de (Bernd Walter) Date: Fri Mar 6 03:03:15 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <49B037F6.3080001@gmx.net> References: <200903050758.n257wod8088426@lurza.secnetix.de> <49B02211.1010809@abitos.org> <49B037F6.3080001@gmx.net> Message-ID: <20090306110259.GG64172@cicely7.cicely.de> On Thu, Mar 05, 2009 at 09:37:10PM +0100, Daniel Thiele wrote: > > "Travelstar 5K320 Specification - HTSxxx models v1.0" avilable at > http://www.hitachigst.com/tech/techlib.nsf/products/Travelstar_5K320 > > says in the paragraph "Required power-off sequence": "The required host > system sequence for removing power from the drive is as follows [...]" > whereas the TravelStar 5K100 specifications lists exactly the same steps > but states that it is the BIOS' job to take care of executing them. > > | Perhaps the OP's BIOS for some reason doesn't do this correctly. The BIOS can only do this for known drives. There is always the chance that the kernel knows more drives than the BIOS, since usually people (including me) don't bother to tell the BIOS about more than the boot drive. 
Also, FreeBSD was the most recent user of the ATA controller, and it
might be left in a mode which can't be taken over by the BIOS.

-- 
B.Walter http://www.bwct.de
Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm.

From octavian.covalschi at gmail.com Fri Mar 6 05:40:12 2009
From: octavian.covalschi at gmail.com (Octavian Covalschi)
Date: Fri Mar 6 05:40:19 2009
Subject: Spin down HDD after disk sync or before power off
In-Reply-To: <49B04B19.5010100@gmx.net>
References: <200903041938.n24Jcqdr060153@lurza.secnetix.de>
	<49AF1C1B.3050604@gmx.net>
	<49AF2CCA.2080706@gmx.net>
	<49B04563.3010002@gmx.net>
	<49B04B19.5010100@gmx.net>
Message-ID:

I played more with this, and got here so far:

if (atadev->param.support.command1 & ATA_SUPPORT_STANDBY) {
        device_printf(dev, "Trying to spindown before poweroff.\n");
        atadev->spindown = 1;
        ad_spindown((void *)dev);
} else {
        device_printf(dev, "Cannot spindown before poweroff.\n");
}

For some reason this check works on my laptop:

if (atadev->param.support.command1 & ATA_SUPPORT_STANDBY)

instead of

if (atadev->param.support.command2 & ATA_SUPPORT_STANDBY)

command1 vs command2

I'm using 7.1-STABLE...

By the way, does anyone know why ad_shutdown is _not_ called at poweroff?
Apparently it's called only at halt & reboot... Still looking...

PS: I think my last post didn't get to the entire mailing list, so I'm
trying to send it again.

On Thu, Mar 5, 2009 at 3:58 PM, Daniel Thiele wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Octavian Covalschi wrote:
> | I tried your patch 1st, but it didn't work for me for shutdown, although I
> | didn't try it with halt (I assumed they work the same).
> |
> | While I was looking into that, I've discovered "ad_spindown" function, and
> | tried to use it, and as I said, it works (for me at least), but only with
> | > > Hmm, here is what David Tolpin mentioned back in 2006 when he replied > (privately) to the fist fix I came up with: > > "Besides, I had to increase timeout in ata-queue for controlcmd " > > I am no kernel expert, so I am not quite sure how to incorporate his > suggestion, but may be this helps with your problem. > > I would be interested in your progress on this topic, for maybe some day > one of my machines will refuse to spin down the disk with the "simple" > patch, too. > > | > | On Thu, Mar 5, 2009 at 3:34 PM, Daniel Thiele wrote: > | > | Octavian Covalschi wrote: > | | OK. > | | > | | After several _kernel_ recompilations (by the end I found out that I > | can use > | | -DNO_KERNELCLEAN ) I've got some results. > | | > | | As i found out that ata-disk.c already has ad_spindown function, witch > I > | | tried to use, so after small changes I have: > | | > | | static void > | | ad_shutdown(device_t dev) > | | { > | | struct ata_device *atadev = device_get_softc(dev); > | | > | | if (atadev->param.support.command2 & ATA_SUPPORT_FLUSHCACHE) > | | ata_controlcmd(dev, ATA_FLUSHCACHE, 0, 0, 0); > | | > | | /* start */ > | | device_printf(dev, "Forced spindown\n"); > | | atadev->spindown = 1; > | | ad_spindown((void *)dev); > | | /* end */ > | | } > | | > | | But for some reason this works only with Halt or shutdown -h now, on > | | shutdown -p it even doesn't get inside ad_shutdown. > | | Well at least I have this :) > | | > | > | Does putting > | > | if (atadev->param.support.command2 & > (ATA_SUPPORT_APM|ATA_SUPPORT_STANDBY)) > | ~ ata_controlcmd(dev, ATA_STANDBY_IMMEDIATE, 0, 0, 0); > | > | directly into ad_shutdown() work? > | > | About your gmirror question: Unfortunately I never used gmirror together > | with the spindown-hack, but I (as a just layperson on this topic(!)) do > | not see any reason why this could cause a problem, since ad_shutdown() > | is most likely called after the disks got unmounted and after GEOM is > | done with them. 
> | > | Best regards, > | > | Daniel > |> > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.11 (FreeBSD) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org > > iEYEARECAAYFAkmwSxUACgkQCOZKcWNoXg5QBQCcDADmK8RrIduZCAY6IksuHSNm > disAnRUjx6SgGUPghw+/X9uf5oFFdEs/ > =xmQO > -----END PGP SIGNATURE----- > From jilles at stack.nl Fri Mar 6 06:09:07 2009 From: jilles at stack.nl (Jilles Tjoelker) Date: Fri Mar 6 06:09:13 2009 Subject: How to tear down a geom mirror? In-Reply-To: <17349951.141236320867093.JavaMail.HALO$@halo> References: <17738942.121236320716364.JavaMail.HALO$@halo> <17349951.141236320867093.JavaMail.HALO$@halo> Message-ID: <20090306140850.GA62926@stack.nl> On Thu, Mar 05, 2009 at 10:27:50PM -0800, Peter Steele wrote: > I've created a USB boot disk that is used to clone itself onto the > systems hard drives, setting up mirrored file systems in the process. > The main difficulty I'm having is reimaging a system with an existing > OS whose drives are already configured in a mirror. I want of course > to destroy the mirror and create a complete new one, but I can't find > the right process to accomplish this reliably. I don't want to make > any assumptions about what mirrors might exist already and I > definitely don't want to do "gmirror load" before I get a chance to > destroy any existing mirrors. > What I am doing is to clean the drive using dd. For example, assume my > target system has two drives ad1 and ad2. I issue the following > commands: > dd if=/dev/zero of=/dev/ad1 bs=512 count=79 > dd if=/dev/zero of=/dev/ad2 bs=512 count=79 gmirror and various other geom modules store their metadata on the last sector(s) of the drive, so you need to wipe that too. -- Jilles Tjoelker From ivoras at freebsd.org Fri Mar 6 06:17:16 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Fri Mar 6 06:17:23 2009 Subject: How to tear down a geom mirror? 
In-Reply-To: <20090306140850.GA62926@stack.nl> References: <17738942.121236320716364.JavaMail.HALO$@halo> <17349951.141236320867093.JavaMail.HALO$@halo> <20090306140850.GA62926@stack.nl> Message-ID: Jilles Tjoelker wrote: > On Thu, Mar 05, 2009 at 10:27:50PM -0800, Peter Steele wrote: >> I've created a USB boot disk that is used to clone itself onto the >> systems hard drives, setting up mirrored file systems in the process. >> The main difficulty I'm having is reimaging a system with an existing >> OS whose drives are already configured in a mirror. I want of course >> to destroy the mirror and create a complete new one, but I can't find >> the right process to accomplish this reliably. I don't want to make >> any assumptions about what mirrors might exist already and I >> definitely don't want to do "gmirror load" before I get a chance to >> destroy any existing mirrors. > >> What I am doing is to clean the drive using dd. For example, assume my >> target system has two drives ad1 and ad2. I issue the following >> commands: > >> dd if=/dev/zero of=/dev/ad1 bs=512 count=79 >> dd if=/dev/zero of=/dev/ad2 bs=512 count=79 > > gmirror and various other geom modules store their metadata on the last > sector(s) of the drive, so you need to wipe that too. Or simply use the "clean" command, for example "gmirror clean" (also supported in other GEOM classes). -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090306/97e7ce6d/signature.pgp From psteele at maxiscale.com Fri Mar 6 06:21:07 2009 From: psteele at maxiscale.com (Peter Steele) Date: Fri Mar 6 06:21:15 2009 Subject: How to tear down a geom mirror? 
In-Reply-To: <26247649.201236349212696.JavaMail.HALO$@halo>
Message-ID: <22091257.221236349245872.JavaMail.HALO$@halo>

> Or simply use the "clean" command, for example "gmirror clean" (also
> supported in other GEOM classes).

Can I do a gmirror clean without first doing a gmirror load? That's what
I want to avoid, since it can hang if the mirror is in a bad state.

From psteele at maxiscale.com Fri Mar 6 06:27:12 2009
From: psteele at maxiscale.com (Peter Steele)
Date: Fri Mar 6 06:27:19 2009
Subject: How to tear down a geom mirror?
In-Reply-To: <20090306140850.GA62926@stack.nl>
Message-ID: <4159914.261236349612218.JavaMail.HALO$@halo>

> gmirror and various other geom modules store their metadata on the last
> sector(s) of the drive, so you need to wipe that too.

In our case the systems we are using aren't mirroring the whole drive,
just certain slices. Some systems have a single slice mirrored (plus an
unmirrored slice), and others have two slices mirrored (plus a third
unmirrored slice). I need a way to destroy the existing mirrors without
doing a gmirror load, and ultimately without making any assumptions about
the number or condition of mirrored slices on the drives I am about to
install a new OS onto.

From ivoras at freebsd.org Fri Mar 6 06:59:03 2009
From: ivoras at freebsd.org (Ivan Voras)
Date: Fri Mar 6 06:59:10 2009
Subject: How to tear down a geom mirror?
In-Reply-To: <22091257.221236349245872.JavaMail.HALO$@halo>
References: <26247649.201236349212696.JavaMail.HALO$@halo>
	<22091257.221236349245872.JavaMail.HALO$@halo>
Message-ID:

Peter Steele wrote:
>> Or simply use the "clean" command, for example "gmirror clean" (also
>> supported in other GEOM classes).
>
> Can I do a gmirror clean without first doing a gmirror load? That's what
> I want to avoid since it can hang if the mirror is is a bad state.

Sorry, the actual command is "clear", not "clean".

Yes.
The "clear" commands usually just zero out the last sector of the
underlying provider (it doesn't matter if it's a drive, a slice or
something altogether different), so you don't have to do it manually.

From psteele at maxiscale.com Fri Mar 6 07:26:01 2009
From: psteele at maxiscale.com (Peter Steele)
Date: Fri Mar 6 07:26:08 2009
Subject: How to tear down a geom mirror?
In-Reply-To:
Message-ID: <27198392.361236353137467.JavaMail.HALO$@halo>

> Yes. The "clear" commands usually just zero-out the last sector of the
> underlying provider (doesn't matter if it's a drive, slice or something
> altogether different) so you don't have to do it manually.

So, as a generic solution, I could just iterate through all slices of
all drives and run "gmirror clear" on each, and run dd to clear the first
sectors. By the way, what is in these first sectors? I use this command
because I saw it being done in one of the gmirror tutorials. I understand
what the gmirror clear command does, but what is the dd command clearing?

From bsd.quest at googlemail.com Fri Mar 6 08:13:40 2009
From: bsd.quest at googlemail.com (Alexej Sokolov)
Date: Fri Mar 6 08:13:47 2009
Subject: wrong data in remapped buffer
Message-ID: <671bb5fc0903060813s284673e2t4d3c77b0ed6abc54@mail.gmail.com>

Hello,

I am trying to MALLOC a buffer in the kernel and then remap it with
vm_map_find() into the address space of a user process. Sometimes the
remapped buffer in user space contains incorrect data. What could be the
reason for this problem, and how can I solve it?

Thanx,
Alexej

P.S.
Whole code of remapping function: http://pastebin.com/m78da0b37

From olli at lurza.secnetix.de Fri Mar 6 11:15:13 2009
From: olli at lurza.secnetix.de (Oliver Fromme)
Date: Fri Mar 6 11:15:20 2009
Subject: How to tear down a geom mirror?
In-Reply-To:
Message-ID: <200903061915.n26JFBre071274@lurza.secnetix.de>

Peter Steele wrote:
 > > Yes. The "clear" commands usually just zero-out the last sector of the
 > > underlying provider (doesn't matter if it's a drive, slice or something
 > > altogether different) so you don't have to do it manually.
 >
 > So, as a generic solution then I could just iterate through all
 > slices of all drives and run "gmirror clear" on each, and run dd
 > to clear the first sectors. What btw is in these first sectors? I
 > use this command because I saw it being done in one of the gmirror
 > tutorials. I understand what the gmirror clear command does, but what
 > is the dd command clearing?

It clears the MBR (slice table) and GPT or disklabel (partition table),
if any. Depending on how many sectors you clear, it will also destroy
the beginning of the file system, e.g. the first UFS superblock.

By the way, if you cannot use "gmirror clear" for any reason, you can
also easily clear the last sector on any device using the information
from diskinfo. For example:

DEV=/dev/ad0s1a
set -- $(diskinfo $DEV)
BLOCKSIZE=$2                    # sector size in bytes
MEDIASIZE=$4                    # media size in sectors
LASTSEC=$(( $MEDIASIZE - 1 ))   # index of the last sector
dd if=/dev/zero of=$DEV bs=$BLOCKSIZE seek=$LASTSEC count=1

That's pretty much what "gmirror clear /dev/ad0s1a" does.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch.
mbH, Handelsregister: Registergericht München, HRB 125758,
Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd

"One of the main causes of the fall of the Roman Empire was that,
lacking zero, they had no way to indicate successful termination
of their C programs." -- Robert Firth

From psteele at maxiscale.com Fri Mar 6 11:59:36 2009
From: psteele at maxiscale.com (Peter Steele)
Date: Fri Mar 6 11:59:43 2009
Subject: How to tear down a geom mirror?
In-Reply-To: <200903061915.n26JFBre071274@lurza.secnetix.de>
Message-ID: <21286486.691236369552766.JavaMail.HALO$@halo>

Okay, thanks everyone for their feedback. I think I have a workable
solution now.

Peter

----- Original Message -----
From: "Oliver Fromme"
To: freebsd-hackers@FreeBSD.ORG, psteele@maxiscale.com
Sent: Friday, March 6, 2009 11:15:11 AM GMT -08:00 US/Canada Pacific
Subject: Re: How to tear down a geom mirror?

Peter Steele wrote:
 > > Yes. The "clear" commands usually just zero-out the last sector of the
 > > underlying provider (doesn't matter if it's a drive, slice or something
 > > altogether different) so you don't have to do it manually.
 >
 > So, as a generic solution then I could just iterate through all
 > slices of all drives and run "gmirror clear" on each, and run dd
 > to clear the first sectors. What btw is in these first sectors? I
 > use this command because I saw it being done in one of the gmirror
 > tutorials. I understand what the gmirror clear command does, but what
 > is the dd command clearing?

It clears the MBR (slice table) and GPT or disklabel (partition table),
if any. Depending on how many sectors you clear, it will also destroy
the beginning the file system, e.g. the first UFS superblock.

By the way, if you cannot use "gmirror clear" for any reason, you can
also easily clear the last sector on any devices using the information
from diskinfo.
For example: DEV=/dev/ad0s1a set -- $(diskinfo $DEV) BLOCKSIZE=$2 MEDIASIZE=$4 LASTSEC=$(( $MEDIASIZE - 1 )) dd if=/dev/zero of=$DEV bs=$BLOCKSIZE seek=$(( $MEDIASIZE - 1 )) count=1 That's pretty much what "gmirror clear /dev/ad0s1a" does. Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M. Handelsregister: Registergericht Muenchen, HRA 74606, Gesch?ftsfuehrung: secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht M?n- chen, HRB 125758, Gesch?ftsf?hrer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd "One of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs." -- Robert Firth From rick-freebsd2008 at kiwi-computer.com Fri Mar 6 12:57:39 2009 From: rick-freebsd2008 at kiwi-computer.com (Rick C. Petty) Date: Fri Mar 6 12:57:46 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090304195614.GA179@britannica.bec.de> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <20090304195614.GA179@britannica.bec.de> Message-ID: <20090306203057.GA49994@keira.kiwi-computer.com> On Wed, Mar 04, 2009 at 08:56:14PM +0100, Joerg Sonnenberger wrote: > On Wed, Mar 04, 2009 at 08:38:52PM +0100, Oliver Fromme wrote: > > Octavian Covalschi wrote: > > > I'm looking a way to spin down HDD just right before power off. Why? > > > > > > Because currently when I call "shutdown -p now", HDD is powered off at it's > > > full speed (7200.4) and as a result > > > I hear a noise of stopping/spinning down of HDD, and _this_ concerns me as > > > I'm afraid it can damage HDD. > > > > You don't have to spin down a disk before powering it off. > > The noise you hear is probably caused by the "autopark" > > feature of the drive. It is harmless. > > This is not true. 
Many hard disks don't like having to do an emergency > shutdown as it affects the disk life time negatively. That's what > happens if you poweroff the machine when the disks are still spinning. I believe you are incorrect. Most hard drives do an "autopark" of the head into the landing zone (which is near the spindle) when power is lost. My understanding is that because it is spinning so fast, the heads can fly for quite a long time so the HDD has enough time to autopark and such an operation does not consume much power. Thus the operation can be performed with a little capacitance or by using some of the mechanical energy in the spindle. If drives did not auto, there would be orders of magnitude more failures due to head crashes. The heads absolutely have to be retracted into the landing zone if the spindle speed is too low or the drive will fail. What's actually bad for the drives is the actual spinup and spindowns, which require the head to sit in the very bumpy landing zone until the drive reaches optimal spindle speed and thus enough airflow to safely move the heads around the platter without contact. Strangely, atacontrol(8) has a command for spindown (which is inherently bad for drives yet still a reasonable feature) but there is no command for spinup. I wish there was a spinup command because I've seen drives that won't do a spinup until they receive a special ATA command. I was never able to find any docs, so if anyone knows the command I'd be willing to write a patch against atacontrol! -- Rick C. 
Petty From octavian.covalschi at gmail.com Fri Mar 6 13:30:15 2009 From: octavian.covalschi at gmail.com (Octavian Covalschi) Date: Fri Mar 6 13:30:22 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090306203057.GA49994@keira.kiwi-computer.com> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <20090304195614.GA179@britannica.bec.de> <20090306203057.GA49994@keira.kiwi-computer.com> Message-ID: Why is spinning down is bad for HDD ? I believe it's better to spindown a drive, instead of cutting power too sudden. On Fri, Mar 6, 2009 at 2:30 PM, Rick C. Petty < rick-freebsd2008@kiwi-computer.com> wrote: > On Wed, Mar 04, 2009 at 08:56:14PM +0100, Joerg Sonnenberger wrote: > > On Wed, Mar 04, 2009 at 08:38:52PM +0100, Oliver Fromme wrote: > > > Octavian Covalschi wrote: > > > > I'm looking a way to spin down HDD just right before power off. Why? > > > > > > > > Because currently when I call "shutdown -p now", HDD is powered off > at it's > > > > full speed (7200.4) and as a result > > > > I hear a noise of stopping/spinning down of HDD, and _this_ concerns > me as > > > > I'm afraid it can damage HDD. > > > > > > You don't have to spin down a disk before powering it off. > > > The noise you hear is probably caused by the "autopark" > > > feature of the drive. It is harmless. > > > > This is not true. Many hard disks don't like having to do an emergency > > shutdown as it affects the disk life time negatively. That's what > > happens if you poweroff the machine when the disks are still spinning. > > I believe you are incorrect. Most hard drives do an "autopark" of the head > into the landing zone (which is near the spindle) when power is lost. My > understanding is that because it is spinning so fast, the heads can fly for > quite a long time so the HDD has enough time to autopark and such an > operation does not consume much power. 
Thus the operation can be performed > with a little capacitance or by using some of the mechanical energy in the > spindle. > > If drives did not auto, there would be orders of magnitude more failures > due to head crashes. The heads absolutely have to be retracted into the > landing zone if the spindle speed is too low or the drive will fail. > > What's actually bad for the drives is the actual spinup and spindowns, > which require the head to sit in the very bumpy landing zone until the > drive reaches optimal spindle speed and thus enough airflow to safely move > the heads around the platter without contact. > > Strangely, atacontrol(8) has a command for spindown (which is inherently > bad for drives yet still a reasonable feature) but there is no command for > spinup. I wish there was a spinup command because I've seen drives that > won't do a spinup until they receive a special ATA command. I was never > able to find any docs, so if anyone knows the command I'd be willing to > write a patch against atacontrol! > > -- Rick C. Petty > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From rick-freebsd2008 at kiwi-computer.com Fri Mar 6 13:47:39 2009 From: rick-freebsd2008 at kiwi-computer.com (Rick C. Petty) Date: Fri Mar 6 13:47:46 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <20090304195614.GA179@britannica.bec.de> <20090306203057.GA49994@keira.kiwi-computer.com> Message-ID: <20090306214738.GA50654@keira.kiwi-computer.com> On Fri, Mar 06, 2009 at 03:30:14PM -0600, Octavian Covalschi wrote: > Why is spinning down is bad for HDD ? I believe it's better to spindown a > drive, > instead of cutting power too sudden. 
Comparing those two, I'd say it shouldn't matter (although probably a forced spindown may be better). But pulling power from a drive does not mean the drive immediately stops doing stuff. I was just saying spindown on disks is bad in the first place. Sure, you might save some wear and tear on the bearings, but you risk problems with the heads on both spindown and spinup. In other words, if you can avoid power-cycling your drives, they should last longer (in that you're less likely to destroy the heads). -- Rick C. Petty From rwatson at FreeBSD.org Sat Mar 7 06:37:10 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sat Mar 7 06:37:17 2009 Subject: uma_zone In-Reply-To: <671bb5fc0903040829m7c7ab79ay612868bb4260bd21@mail.gmail.com> References: <671bb5fc0903040829m7c7ab79ay612868bb4260bd21@mail.gmail.com> Message-ID: On Wed, 4 Mar 2009, Alexej Sokolov wrote: > how can I get the size and pointer of some allocated uma zone ? For example: > zone_pack Could you tell us a bit more about the context in which you want to do this? Normally kernel modules acquire pointers to globally visible zones via a symbol dependency resolved by the kernel linker (zone_pack is a globally visible symbol in the kernel). Our general userspace monitoring tools, such as vmstat -z, don't display the UMA zone pointers, and a pointer to the zone is not exported by the sysctls it depends on, currently, but if you run kgdb on kernel.symbols you should be able to print out the address of the global zone_pack directly. Robert N M Watson Computer Laboratory University of Cambridge From martinbadie at yahoo.com Sat Mar 7 13:22:56 2009 From: martinbadie at yahoo.com (Martin Badie) Date: Sat Mar 7 13:23:03 2009 Subject: select.h FD_SETSIZE and Qmail-Postfix test Message-ID: <67469.69113.qm@web59906.mail.ac4.yahoo.com> Hi, There is a test that I am doing with FreeBSD and Linux. This test involves qmail and postfix comparison. Both FreeBSD and Linux seems to have 1024 File Descriptor limit. 
(FD_SETSIZE in select.h in FreeBSD) . To have a better concurrency in qmail on smtp level. I have used a patch named big-todo patch also used big-concurrency patch. These patches helps me to increase concurrency in operating system. I set concurrent connection to 500(tcpserver -c 500). There is no problem until around 400-500 active smtp connection. But if the total smtp connection exceeds 500, load average increases to ~40-50 but cpu system time arises to %50-60. The strange issue is that, this load increases when the connection is limited to accept 500 connections but the tool I use is configured to 700 (more than 500) connections. Normally ucspi-tcp software limits connection to 500 ( -c 500) I suspect it is something to do with Operating system level. Additionally I have also patched FreeBSD kernel with 4096 FD_SETSIZE in select.h in kernel and booted with that kernel. I have also compiled qmail from scratch to accept 2040 connections (in conf-spawn) but there is no change I mean I still can't get more than decent 500 connections with acceptable load average. I have also used postfix on both Linux FreeBSD: default_process_limit = 500 smtpd_client_connection_count_limit = 500 but I still get strange load when connection raises more than 500 I suspect something is missing or need to be configured on the operating system level (both Linux and FreeBSD) From ticso at cicely7.cicely.de Sun Mar 8 05:36:19 2009 From: ticso at cicely7.cicely.de (Bernd Walter) Date: Sun Mar 8 05:36:26 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090306214738.GA50654@keira.kiwi-computer.com> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <20090304195614.GA179@britannica.bec.de> <20090306203057.GA49994@keira.kiwi-computer.com> <20090306214738.GA50654@keira.kiwi-computer.com> Message-ID: <20090308123608.GC82478@cicely7.cicely.de> On Fri, Mar 06, 2009 at 03:47:38PM -0600, Rick C. 
Petty wrote: > On Fri, Mar 06, 2009 at 03:30:14PM -0600, Octavian Covalschi wrote: > > Why is spinning down is bad for HDD ? I believe it's better to spindown a > > drive, > > instead of cutting power too sudden. > > Comparing those two, I'd say it shouldn't matter (although probably a > forced spindown may be better). But pulling power from a drive does not > mean the drive immediately stops doing stuff. My understanding is that without power the heads just slamm into landing zone, while it can be done in a controlled smooth way with power. > I was just saying spindown on disks is bad in the first place. Sure, you > might save some wear and tear on the bearings, but you risk problems with > the heads on both spindown and spinup. In other words, if you can avoid > power-cycling your drives, they should last longer (in that you're less > likely to destroy the heads). This depends on the disks. Desktop and especially mobile drives are designed to sustain more spin downs, but are not designed for rotating a long time. But of course if you intend to spin up directly after spin down it might be bad for them as well, since it isn't really saving spinning time. This is nothing, which should be done on reboot, but for halts it might be reasonable to do. -- B.Walter http://www.bwct.de Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm. From freebsd at jayloden.com Sun Mar 8 07:20:07 2009 From: freebsd at jayloden.com (Jay Loden) Date: Sun Mar 8 07:20:17 2009 Subject: CPU user/kernel time given the PID Message-ID: <49B3D01E.1010600@jayloden.com> I'm working on FreeBSD support for a Python library called psutil for reading process information in a cross-platform fashion. Each platform-specific module is written in C, so the majority of the FreeBSD code is a C interface to various process information. I've been having some trouble working out how to get CPU user/kernel time for a given PID. 
I took a look at the source to top and ps but neither really helped since they don't seem to cover the info I was looking for (or I missed it). I'm not sure if there's a better way to go about this but I've been looking at sysctl and the kinfo_proc struct - is there somewhere more appropriate to retrieve this information? If the kinfo_proc struct is the way to go, then do I want to use ki_runtime, ki_swtime or something else, and does that mean there's no distinction between user/kern time for a process? If anyone has code samples or recommended docs to get me pointed in the right direction that would be great. Thanks, -Jay From bsd.quest at googlemail.com Sun Mar 8 11:00:32 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Sun Mar 8 11:00:38 2009 Subject: uma_zone In-Reply-To: References: <671bb5fc0903040829m7c7ab79ay612868bb4260bd21@mail.gmail.com> Message-ID: <671bb5fc0903081100x25dc6f4g829039f2ac51015c@mail.gmail.com> 2009/3/7 Robert Watson > On Wed, 4 Mar 2009, Alexej Sokolov wrote: > > how can I get the size and pointer of some allocated uma zone ? For >> example: zone_pack >> > > Could you tell us a bit more about the context in which you want to do > this? Interrupt kontext. > Normally kernel modules acquire pointers to globally visible zones via a > symbol dependency resolved by the kernel linker (zone_pack is a globally > visible symbol in the kernel). But what about the size ? Do the UMA zones have fixed sizes? What I want to do is to remap zone_pack into the user space in order to give user applications access to mbuf clusters with frames. > Our general userspace monitoring tools, such as vmstat -z, don't display > the UMA zone pointers, and a pointer to the zone is not exported by the > sysctls it depends on, currently, but if you run kgdb on kernel.symbols you > should be able to print out the address of the global zone_pack directly. > > Robert N M Watson > Computer Laboratory > University of Cambridge Thanx a lot! 
From olli at lurza.secnetix.de Sun Mar 8 14:56:54 2009 From: olli at lurza.secnetix.de (Oliver Fromme) Date: Sun Mar 8 14:57:02 2009 Subject: CPU user/kernel time given the PID In-Reply-To: <49B3D01E.1010600@jayloden.com> Message-ID: <200903082156.n28Lup7e085565@lurza.secnetix.de> Jay Loden wrote: > I'm working on FreeBSD support for a Python library called psutil for reading > process information in a cross-platform fashion. Each platform-specific module > is written in C, so the majority of the FreeBSD code is a C interface to various > process information. I've been having some trouble working out how to get CPU > user/kernel time for a given PID. I took a look at the source to top and ps but > neither really helped since they don't seem to cover the info I was looking for > (or I missed it). > > I'm not sure if there's a better way to go about this but I've been looking at > sysctl and the kinfo_proc struct - is there somewhere more appropriate to > retrieve this information? If the kinfo_proc struct is the way to go, then do I > want to use ki_runtime, ki_swtime or something else, and does that mean there's > no distinction between user/kern time for a process? If anyone has code samples > or recommended docs to get me pointed in the right direction that would be great. ps(1) and top(1) both use ki_pctcpu, see the getpcpu() function in src/bin/ps/print.c and format_next_process() in src/usr.bin/top/machine.c As far as I know, there is no distinction between user- mode and kernel-mode CPU time per process. It should also be noted that the kernel's time cannot always be attributed to a certain userland process. I would even guess is that the majority of the CPU time spent in the kernel is not on behalf of a specific userland process. Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M. Handelsregister: Registergericht Muenchen, HRA 74606, Gesch?ftsfuehrung: secnetix Verwaltungsgesellsch. 
mbH, Handelsregister: Registergericht München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd "C is quirky, flawed, and an enormous success." -- Dennis M. Ritchie. From freebsd at jayloden.com Sun Mar 8 17:33:31 2009 From: freebsd at jayloden.com (Jay Loden) Date: Sun Mar 8 17:33:38 2009 Subject: CPU user/kernel time given the PID In-Reply-To: <200903082156.n28Lup7e085565@lurza.secnetix.de> References: <200903082156.n28Lup7e085565@lurza.secnetix.de> Message-ID: <49B463D7.9010401@jayloden.com> Oliver Fromme wrote: > ps(1) and top(1) both use ki_pctcpu, see the getpcpu() > function in src/bin/ps/print.c and format_next_process() > in src/usr.bin/top/machine.c Hi Oliver, thanks for the reply. I noticed the same after some digging through the source code for ps and top. While CPU usage % is a useful number also, I was hoping to be able to get CPU time(s). Possibly that information simply isn't available on FreeBSD like it is for other OSes. > As far as I know, there is no distinction between user- > mode and kernel-mode CPU time per process. It should > also be noted that the kernel's time cannot always be > attributed to a certain userland process. I would even > guess is that the majority of the CPU time spent in the > kernel is not on behalf of a specific userland process. I would suspect the same, but I did notice that times() does return separate values for user/system time on FreeBSD, so that implies that the system is able to differentiate between the two somehow. If you can get it from within the current running process the data must be there but I've no idea what interface (if any) exists to read that information for other processes.
-Jay From dnelson at allantgroup.com Sun Mar 8 18:21:41 2009 From: dnelson at allantgroup.com (Dan Nelson) Date: Sun Mar 8 18:21:48 2009 Subject: CPU user/kernel time given the PID In-Reply-To: <49B463D7.9010401@jayloden.com> References: <200903082156.n28Lup7e085565@lurza.secnetix.de> <49B463D7.9010401@jayloden.com> Message-ID: <20090309012137.GG3398@dan.emsphone.com> In the last episode (Mar 08), Jay Loden said: > Oliver Fromme wrote: > > ps(1) and top(1) both use ki_pctcpu, see the getpcpu() function in > > src/bin/ps/print.c and format_next_process() in > > src/usr.bin/top/machine.c > > Hi Oliver, thanks for the reply. I noticed the same after some digging > through the source code for ps and top. While CPU usage % is a useful > number also, I was hoping to be able to get CPU time(s). Possibly that > information simply isn't available on FreeBSD like it is for other OSes. I was wondering why you were having so much trouble finding what you were looking for, and then I realized I have a patch that I have never submitted a PR for: the addition of "systime" and "usertime" ps keywords :) It simply reads the rusage struct, and returns the same values that getrusage() does. 
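For readers who want to see the numbers Dan refers to, here is a minimal sketch in C. It is an illustration only, not part of Dan's patch: it reads the split user/system times for the calling process with getrusage(2) and formats them with the same round-to-hundredths arithmetic the patch uses. The helper names format_cputime() and self_cputimes() are hypothetical, and the decimal point is hard-coded to '.' where the patch asks localeconv() for it. Reading the rusage of another PID, which is what the patch does through kinfo_proc, is not shown here.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Format a timeval as minutes:seconds.hundredths.  Microseconds are
 * rounded to hundredths of a second and any overflow is carried into
 * the seconds, mirroring the arithmetic in the patch.
 * (Hypothetical helper; '.' is assumed as the decimal point.)
 */
static void
format_cputime(const struct timeval *tv, char *buf, size_t len)
{
	long secs = (long)tv->tv_sec;
	long psecs = (long)tv->tv_usec;

	psecs = (psecs + 5000) / 10000;	/* round and scale to 100's */
	secs += psecs / 100;
	psecs = psecs % 100;
	(void)snprintf(buf, len, "%3ld:%02ld.%02ld",
	    secs / 60, secs % 60, psecs);
}

/*
 * Fetch user and system CPU time for the calling process only.
 * Returns 0 on success, -1 if getrusage() fails.
 * (Hypothetical helper.)
 */
static int
self_cputimes(struct timeval *user, struct timeval *sys)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) == -1)
		return (-1);
	*user = ru.ru_utime;
	*sys = ru.ru_stime;
	return (0);
}
```

A caller would invoke self_cputimes() and pass each timeval to format_cputime() with a buffer of a dozen bytes or more. The carry matters at the boundary: 59 seconds and 995000 microseconds rounds up to 100 hundredths and is printed as one minute, zero seconds.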
-- Dan Nelson dnelson@allantgroup.com
-------------- next part --------------
Index: extern.h
===================================================================
RCS file: /home/ncvs/src/bin/ps/extern.h,v
retrieving revision 1.37
diff -u -p -r1.37 extern.h
--- extern.h	23 Jun 2004 23:48:09 -0000	1.37
+++ extern.h	7 Jan 2005 06:46:15 -0000
@@ -78,11 +78,13 @@ int	 s_uname(KINFO *);
 void	 showkey(void);
 void	 started(KINFO *, VARENT *);
 void	 state(KINFO *, VARENT *);
+void	 systime(KINFO *, VARENT *);
 void	 tdev(KINFO *, VARENT *);
 void	 tname(KINFO *, VARENT *);
 void	 ucomm(KINFO *, VARENT *);
 void	 uname(KINFO *, VARENT *);
 void	 upr(KINFO *, VARENT *);
+void	 usertime(KINFO *, VARENT *);
 void	 vsize(KINFO *, VARENT *);
 void	 wchan(KINFO *, VARENT *);
 __END_DECLS
Index: keyword.c
===================================================================
RCS file: /home/ncvs/src/bin/ps/keyword.c,v
retrieving revision 1.76
diff -u -p -r1.76 keyword.c
--- keyword.c	6 Apr 2006 03:24:31 -0000	1.76
+++ keyword.c	2 Mar 2007 17:23:10 -0000
@@ -185,6 +185,7 @@ static VAR var[] = {
 		UINT, UIDFMT, 0},
 	{"svuid", "SVUID", NULL, 0, kvar, NULL, UIDLEN, KOFF(ki_svuid),
 		UINT, UIDFMT, 0},
+	{"systime", "SYSTIME", NULL, USER, systime, NULL, 9, 0, CHAR, NULL, 0},
 	{"tdev", "TDEV", NULL, 0, tdev, NULL, 4, 0, CHAR, NULL, 0},
 	{"time", "TIME", NULL, USER, cputime, NULL, 9, 0, CHAR, NULL, 0},
 	{"tpgid", "TPGID", NULL, 0, kvar, NULL, 4, KOFF(ki_tpgid), UINT,
@@ -203,6 +204,7 @@ static VAR var[] = {
 		"lx", 0},
 	{"user", "USER", NULL, LJUST|DSIZ, uname, s_uname, USERLEN, 0, CHAR,
 		NULL, 0},
+	{"usertime", "USERTIME", NULL, USER, usertime, NULL, 9, 0, CHAR, NULL, 0},
 	{"usrpri", "", "upr", 0, NULL, NULL, 0, 0, CHAR, NULL, 0},
 	{"vsize", "", "vsz", 0, NULL, NULL, 0, 0, CHAR, NULL, 0},
 	{"vsz", "VSZ", NULL, 0, vsize, NULL, 5, 0, CHAR, NULL, 0},
Index: print.c
===================================================================
RCS file: /home/ncvs/src/bin/ps/print.c,v
retrieving revision 1.95
diff -u -p -r1.95 print.c
--- print.c	17 Sep 2007 05:27:18 -0000	1.95
+++ print.c	11 Oct 2007 19:54:02 -0000
@@ -551,6 +551,79 @@ cputime(KINFO *k, VARENT *ve)
 }
 
 void
+systime(KINFO *k, VARENT *ve)
+{
+	VAR *v;
+	long secs;
+	long psecs;	/* "parts" of a second. first micro, then centi */
+	char obuff[128];
+	static char decimal_point;
+
+	if (decimal_point == '\0')
+		decimal_point = localeconv()->decimal_point[0];
+	v = ve->var;
+	if (!k->ki_valid) {
+		secs = 0;
+		psecs = 0;
+	} else {
+		/*
+		 * This counts time spent handling interrupts.  We could
+		 * fix this, but it is not 100% trivial (and interrupt
+		 * time fractions only work on the sparc anyway).	XXX
+		 */
+		secs = k->ki_p->ki_rusage.ru_stime.tv_sec;
+		psecs = k->ki_p->ki_rusage.ru_stime.tv_usec;
+		if (sumrusage) {
+			secs += k->ki_p->ki_childstime.tv_sec;
+			psecs += k->ki_p->ki_childstime.tv_usec;
+		}
+		/*
+		 * round and scale to 100's
+		 */
+		psecs = (psecs + 5000) / 10000;
+		secs += psecs / 100;
+		psecs = psecs % 100;
+	}
+	(void)snprintf(obuff, sizeof(obuff), "%3ld:%02ld%c%02ld",
+	    secs / 60, secs % 60, decimal_point, psecs);
+	(void)printf("%*s", v->width, obuff);
+}
+
+void
+usertime(KINFO *k, VARENT *ve)
+{
+	VAR *v;
+	long secs;
+	long psecs;	/* "parts" of a second. first micro, then centi */
+	char obuff[128];
+	static char decimal_point;
+
+	if (decimal_point == '\0')
+		decimal_point = localeconv()->decimal_point[0];
+	v = ve->var;
+	if (!k->ki_valid) {
+		secs = 0;
+		psecs = 0;
+	} else {
+		secs = k->ki_p->ki_rusage.ru_utime.tv_sec;
+		psecs = k->ki_p->ki_rusage.ru_utime.tv_usec;
+		if (sumrusage) {
+			secs += k->ki_p->ki_childutime.tv_sec;
+			psecs += k->ki_p->ki_childutime.tv_usec;
+		}
+		/*
+		 * round and scale to 100's
+		 */
+		psecs = (psecs + 5000) / 10000;
+		secs += psecs / 100;
+		psecs = psecs % 100;
+	}
+	(void)snprintf(obuff, sizeof(obuff), "%3ld:%02ld%c%02ld",
+	    secs / 60, secs % 60, decimal_point, psecs);
+	(void)printf("%*s", v->width, obuff);
+}
+
+void
 elapsed(KINFO *k, VARENT *ve)
 {
 	VAR *v;
Index: ps.1
===================================================================
RCS file: /home/ncvs/src/bin/ps/ps.1,v
retrieving revision 1.89
diff -u -p -r1.89 ps.1
--- ps.1	17 Sep 2006 17:40:06 -0000	1.89
+++ ps.1	2 Mar 2007 17:23:11 -0000
@@ -571,6 +571,8 @@ symbolic process state (alias
 saved gid from a setgid executable
 .It Cm svuid
 saved UID from a setuid executable
+.It Cm systime
+accumulated system CPU time
 .It Cm tdev
 control terminal device number
 .It Cm time
@@ -599,6 +601,8 @@ scheduling priority on return from syste
 .Cm usrpri )
 .It Cm user
 user name (from UID)
+.It Cm usertime
+accumulated user CPU time
 .It Cm vsz
 virtual size in Kbytes (alias
 .Cm vsize )
From david at catwhisker.org Sun Mar 8 19:03:16 2009 From: david at catwhisker.org (David Wolfskill) Date: Sun Mar 8 19:03:24 2009 Subject: CPU user/kernel time given the PID In-Reply-To: <49B463D7.9010401@jayloden.com> References: <200903082156.n28Lup7e085565@lurza.secnetix.de> <49B463D7.9010401@jayloden.com> Message-ID: <20090309020315.GW4315@albert.catwhisker.org> On Sun, Mar 08, 2009 at 08:33:27PM -0400, Jay Loden wrote: > Oliver Fromme wrote: > > ps(1) and top(1) both use ki_pctcpu, see the getpcpu() > > function in src/bin/ps/print.c and
format_next_process() > > in src/usr.bin/top/machine.c > > Hi Oliver, thanks for the reply. I noticed the same after some digging through > the source code for ps and top. While CPU usage % is a useful number also, I was > hoping to be able to get CPU time(s). Possibly that information simply isn't > available on FreeBSD like it is for other OSes. Have you checked to see if you can make use of the information provided by procfs(5)? In particular, I note: ... status The process status. This file is read-only and returns a single line containing multiple space-separated fields as follows: o command name o process id ... o the process start time in seconds and microseconds, comma separated. o the user time in seconds and microseconds, comma separated. o the system time in seconds and microseconds, comma separated. o the wait channel message .... Thus, on my laptop, I see: g1-35(6.4-S)[1] cat /proc/`pgrep firefox-bin`/status firefox-bin 1735 1730 1549 1454 - noflags 1236526247,367664 3289,390208 477,843140 -kse- 1001 1001 1001,1001,1001,0,20,68,69,1004 - g1-35(6.4-S)[2] So above-listed items would be: * firefox-bin * 1735 ... * 1236526247,367664 * 3289,390208 * 477,843140 * -kse- * -kse- .... Granted, not every machine will necessarily have PROCFS in the kernel configuration, but it is in GENERIC. > ... Peace, david -- David H. Wolfskill david@catwhisker.org Depriving a girl or boy of an opportunity for education is evil. See http://www.catwhisker.org/~david/publickey.gpg for my public key.
From freebsd at jayloden.com Mon Mar 9 06:35:41 2009 From: freebsd at jayloden.com (Jay Loden) Date: Mon Mar 9 06:35:48 2009 Subject: CPU user/kernel time given the PID In-Reply-To: <20090309012137.GG3398@dan.emsphone.com> References: <200903082156.n28Lup7e085565@lurza.secnetix.de> <49B463D7.9010401@jayloden.com> <20090309012137.GG3398@dan.emsphone.com> Message-ID: <49B51B28.8040604@jayloden.com> Dan Nelson wrote: > I was wondering why you were having so much trouble finding what you were > looking for, and then I realized I have a patch that I have never submitted > a PR for: the addition of "systime" and "usertime" ps keywords :) It simply > reads the rusage struct, and returns the same values that getrusage() does. Dan, this is great, exactly what I was looking for, didn't think to look for 'rusage' in the values in kinfo_proc. Thanks! -Jay From jhb at freebsd.org Mon Mar 9 10:22:46 2009 From: jhb at freebsd.org (John Baldwin) Date: Mon Mar 9 10:22:54 2009 Subject: wrong data in remapped buffer In-Reply-To: <671bb5fc0903060813s284673e2t4d3c77b0ed6abc54@mail.gmail.com> References: <671bb5fc0903060813s284673e2t4d3c77b0ed6abc54@mail.gmail.com> Message-ID: <200903090821.25871.jhb@freebsd.org> On Friday 06 March 2009 11:13:38 am Alexej Sokolov wrote: > Hello, > I try to MALLOC a buffer in kern, then remap it with vm_map_find(), to space > of user process. > Some times the remapped buffer in user space contain incorrect data. What architecture are you using? On some archs like amd64, small mallocs (<= PAGE_SIZE) don't use the kmem_map or kmem_object.
-- John Baldwin From cliftonr at lava.net Mon Mar 9 11:27:04 2009 From: cliftonr at lava.net (Clifton Royston) Date: Mon Mar 9 11:27:12 2009 Subject: freebsd-hackers Digest, Vol 310, Issue 6 In-Reply-To: <20090308120022.523F81065672@hub.freebsd.org> References: <20090308120022.523F81065672@hub.freebsd.org> Message-ID: <20090309182700.GA20062@lava.net> On Sun, Mar 08, 2009 at 12:00:22PM +0000, freebsd-hackers-request@freebsd.org wrote: > Date: Sat, 7 Mar 2009 13:08:56 -0800 (PST) > From: Martin Badie > Subject: select.h FD_SETSIZE and Qmail-Postfix test > To: freebsd-hackers@freebsd.org > Message-ID: <67469.69113.qm@web59906.mail.ac4.yahoo.com> > Content-Type: text/plain; charset=us-ascii > > Hi, > > There is a test that I am doing with FreeBSD and Linux. This test > involves qmail and postfix comparison. Both FreeBSD and Linux seems > to have 1024 File Descriptor limit. (FD_SETSIZE in select.h in > FreeBSD) . > > To have a better concurrency in qmail on smtp level. I have used a > patch named big-todo patch also used big-concurrency patch. These > patches helps me to increase concurrency in operating system. I set > concurrent connection to 500(tcpserver -c 500). There is no problem > until around 400-500 active smtp connection. But if the total smtp > connection exceeds 500, load average increases to ~40-50 but cpu > system time arises to %50-60. The strange issue is that, this load > increases when the connection is limited to accept 500 connections > but the tool I use is configured to 700 (more than 500) connections. > Normally ucspi-tcp software limits connection to 500 ( -c 500) I > suspect it is something to do with Operating system level. > > Additionally I have also patched FreeBSD kernel with 4096 FD_SETSIZE > in select.h in kernel and booted with that kernel. 
I have also > compiled qmail from scratch to accept 2040 connections (in > conf-spawn) but there is no change I mean I still can't get more than > decent 500 connections with acceptable load average. > > I have also used postfix on both Linux FreeBSD: > > default_process_limit = 500 > smtpd_client_connection_count_limit = 500 > > but I still get strange load when connection raises more than 500 > > I suspect something is missing or need to be configured on the operating system level (both Linux and FreeBSD) One point which you might be missing is that both FreeBSD and Linux (and I think most other modern OSes) have long since deprecated the select interface for high performance/high concurrency software. On FreeBSD the preferred mechanism is kqueue, and IIRC Postfix prefers to build with the kqueue interface on FreeBSD. Linux uses something else which escapes me at the moment; perhaps epoll? This makes benchmarks on select() primarily of historic interest. -- Clifton -- Clifton Royston -- cliftonr@iandicomputing.com / cliftonr@lava.net President - I and I Computing * http://www.iandicomputing.com/ Custom programming, network design, systems and network consulting services From bsd.quest at googlemail.com Mon Mar 9 12:38:58 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Mon Mar 9 12:39:05 2009 Subject: wrong data in remapped buffer In-Reply-To: <200903090821.25871.jhb@freebsd.org> References: <671bb5fc0903060813s284673e2t4d3c77b0ed6abc54@mail.gmail.com> <200903090821.25871.jhb@freebsd.org> Message-ID: <671bb5fc0903091238q2c4e4bd7m661333a509395b61@mail.gmail.com> 2009/3/9 John Baldwin > On Friday 06 March 2009 11:13:38 am Alexej Sokolov wrote: > > Hello, > > I try to MALLOC a buffer in kern, then remap it with vm_map_find(), to > space > > of user process. > > Some times the remapped buffer in user space contain incorrect data. > > What architecture are you using? 
On some archs like amd64, small mallocs > (<= > PAGE_SIZE) don't use the kmem_map or kmem_object. > > -- > John Baldwin > anyway , the error happens only some times... I think there is other reason. My hardware is amd64 % uname -ms FreeBSD i386 From jhb at freebsd.org Mon Mar 9 13:42:32 2009 From: jhb at freebsd.org (John Baldwin) Date: Mon Mar 9 13:42:38 2009 Subject: wrong data in remapped buffer In-Reply-To: <671bb5fc0903091238q2c4e4bd7m661333a509395b61@mail.gmail.com> References: <671bb5fc0903060813s284673e2t4d3c77b0ed6abc54@mail.gmail.com> <200903090821.25871.jhb@freebsd.org> <671bb5fc0903091238q2c4e4bd7m661333a509395b61@mail.gmail.com> Message-ID: <200903091618.32955.jhb@freebsd.org> On Monday 09 March 2009 3:38:55 pm Alexej Sokolov wrote: > 2009/3/9 John Baldwin > > > On Friday 06 March 2009 11:13:38 am Alexej Sokolov wrote: > > > Hello, > > > I try to MALLOC a buffer in kern, then remap it with vm_map_find(), to > > space > > > of user process. > > > Some times the remapped buffer in user space contain incorrect data. > > > > What architecture are you using? On some archs like amd64, small mallocs > > (<= > > PAGE_SIZE) don't use the kmem_map or kmem_object. > > > > -- > > John Baldwin > > > anyway , the error happens only some times... I think there is other reason. > My hardware is amd64 > % uname -ms > FreeBSD i386 i386 always uses kmem for malloc(9). -- John Baldwin From rick-freebsd2008 at kiwi-computer.com Mon Mar 9 15:17:17 2009 From: rick-freebsd2008 at kiwi-computer.com (Rick C. 
Petty) Date: Mon Mar 9 15:17:25 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090308123608.GC82478@cicely7.cicely.de> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <20090304195614.GA179@britannica.bec.de> <20090306203057.GA49994@keira.kiwi-computer.com> <20090306214738.GA50654@keira.kiwi-computer.com> <20090308123608.GC82478@cicely7.cicely.de> Message-ID: <20090309221715.GA77196@keira.kiwi-computer.com> On Sun, Mar 08, 2009 at 01:36:09PM +0100, Bernd Walter wrote: > On Fri, Mar 06, 2009 at 03:47:38PM -0600, Rick C. Petty wrote: > > On Fri, Mar 06, 2009 at 03:30:14PM -0600, Octavian Covalschi wrote: > > > Why is spinning down is bad for HDD ? I believe it's better to spindown a > > > drive, > > > instead of cutting power too sudden. > > > > Comparing those two, I'd say it shouldn't matter (although probably a > > forced spindown may be better). But pulling power from a drive does not > > mean the drive immediately stops doing stuff. > > My understanding is that without power the heads just slamm into > landing zone, while it can be done in a controlled smooth way with > power. Nope, according to a coworker (whose wife works for an HDD manufacturer), the spindle motor is shunted and the generated electricity is used to properly land the head. My coworker also tells me that some new drives are actually parking the heads off the disk, which as I understand is a much more difficult task since you have to worry about vertical separation when you bring the heads back between the platters. > > I was just saying spindown on disks is bad in the first place. Sure, you > > might save some wear and tear on the bearings, but you risk problems with > > the heads on both spindown and spinup. In other words, if you can avoid > > power-cycling your drives, they should last longer (in that you're less > > likely to destroy the heads). > > This depends on the disks. 
> Desktop and especially mobile drives are designed to sustain more > spin downs, but are not designed for rotating a long time. > But of course if you intend to spin up directly after spin down it > might be bad for them as well, since it isn't really saving spinning > time. That may be; I know nothing about differences with mobile drives. If this is true, I'd like to find some replacement 2.5" drives which are intended for continuous spinning. > This is nothing, which should be done on reboot, but for halts it > might be reasonable to do. Not sure what you're trying to say here, but I am for the idea of issuing a spindown request if we know the power is to be cycled. If spindown are issued for all halts, I hope someone makes that a kernel tunable. What I was hoping is that someone could point me to the "spinup" command as I have a drive which does not spin up until it receives this command. Any takers? -- Rick C. Petty From timothy at redaelli.eu Mon Mar 9 15:20:54 2009 From: timothy at redaelli.eu (Timothy Redaelli) Date: Mon Mar 9 15:21:00 2009 Subject: lockf: Invalid argument on pipe Message-ID: Hi, Why can't I do a lockf on a file descriptor that does not point a real file (such as stderr, stdout, or a character device)? Since it works under NetBSD, Linux, Solaris. For portability between systems I hope I can do it under FreeBSD. The following code is simple, but It reproduce the problem. Under non-FreeBSD systems, It will block before the puts. Instead under FreeBSD the lockf calls return error and, so, the lock does not works. Any suggest? 
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	char tmp[256];

	/* Take an exclusive lock on stderr (fd 2). */
	if (lockf(2, F_LOCK, 0) == -1)
		perror("lock");
	/* Re-run ourselves once with "XXX" so the child contends for the lock. */
	snprintf (tmp, 256, "%s XXX", argv[0]);
	if (!argv[1] || strcmp(argv[1], "XXX"))
		system(tmp);
	puts("You should see it only after ctrl+c");
	return EXIT_SUCCESS;
}

-- Timothy Redaelli IT Consultant Email: timothy@redaelli.eu Mobile: +39 (338) 1187273 WWW: http://www.redaelli.eu/ From joerg at britannica.bec.de Mon Mar 9 15:31:11 2009 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Mon Mar 9 15:31:18 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090309221715.GA77196@keira.kiwi-computer.com> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <20090304195614.GA179@britannica.bec.de> <20090306203057.GA49994@keira.kiwi-computer.com> <20090306214738.GA50654@keira.kiwi-computer.com> <20090308123608.GC82478@cicely7.cicely.de> <20090309221715.GA77196@keira.kiwi-computer.com> Message-ID: <20090309222256.GA16286@britannica.bec.de> On Mon, Mar 09, 2009 at 04:17:15PM -0600, Rick C. Petty wrote: > What I was hoping is that someone could point me to the "spinup" command as > I have a drive which does not spin up until it receives this command. Any > takers? There is no such command. Disks are supposed to spin up at the first read/write automatically. It can take a while though.
Joerg From ticso at cicely7.cicely.de Mon Mar 9 16:19:39 2009 From: ticso at cicely7.cicely.de (Bernd Walter) Date: Mon Mar 9 16:19:47 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <20090309221715.GA77196@keira.kiwi-computer.com> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <20090304195614.GA179@britannica.bec.de> <20090306203057.GA49994@keira.kiwi-computer.com> <20090306214738.GA50654@keira.kiwi-computer.com> <20090308123608.GC82478@cicely7.cicely.de> <20090309221715.GA77196@keira.kiwi-computer.com> Message-ID: <20090309231932.GE6840@cicely7.cicely.de> On Mon, Mar 09, 2009 at 04:17:15PM -0600, Rick C. Petty wrote: > On Sun, Mar 08, 2009 at 01:36:09PM +0100, Bernd Walter wrote: > > On Fri, Mar 06, 2009 at 03:47:38PM -0600, Rick C. Petty wrote: > > > On Fri, Mar 06, 2009 at 03:30:14PM -0600, Octavian Covalschi wrote: > > > > Why is spinning down is bad for HDD ? I believe it's better to spindown a > > > > drive, > > > > instead of cutting power too sudden. > > > > > > Comparing those two, I'd say it shouldn't matter (although probably a > > > forced spindown may be better). But pulling power from a drive does not > > > mean the drive immediately stops doing stuff. > > > > My understanding is that without power the heads just slamm into > > landing zone, while it can be done in a controlled smooth way with > > power. > > Nope, according to a coworker (whose wife works for an HDD manufacturer), > the spindle motor is shunted and the generated electricity is used to > properly land the head. My coworker also tells me that some new drives are > actually parking the heads off the disk, which as I understand is a much > more difficult task since you have to worry about vertical separation when > you bring the heads back between the platters. 
The ramp load/unload thing is true: http://www.hitachigst.com/tech/techlib.nsf/techdocs/9076679E3EE4003E86256FAB005825FB/$file/LoadUnload_white_paper_FINAL.pdf Some drives also have the ramp on the inner side. This highly disagrees with the loud clank noise that some disks are doing on power loss. The myth about generating emergency power from spindle rotation is very old, but people from (other?) HDD manufactorers denied that. The above document claims that their drives are also doing power reclaiming from rotation. Another used technology however is using the air current from the rotation or a loaded spring. > > > I was just saying spindown on disks is bad in the first place. Sure, you > > > might save some wear and tear on the bearings, but you risk problems with > > > the heads on both spindown and spinup. In other words, if you can avoid > > > power-cycling your drives, they should last longer (in that you're less > > > likely to destroy the heads). > > > > This depends on the disks. > > Desktop and especially mobile drives are designed to sustain more > > spin downs, but are not designed for rotating a long time. > > But of course if you intend to spin up directly after spin down it > > might be bad for them as well, since it isn't really saving spinning > > time. > > That may be; I know nothing about differences with mobile drives. If this > is true, I'd like to find some replacement 2.5" drives which are intended > for continuous spinning. There are a lots of drives available, the market first came up with blade systems. The unfortunate thing is that 2,5" for contiuous use are usually high speed drives, which takes a lot of power, which makes them a bad choice for 24/7 low power devices. > > This is nothing, which should be done on reboot, but for halts it > > might be reasonable to do. > > Not sure what you're trying to say here, but I am for the idea of > issuing a spindown request if we know the power is to be cycled. 
If > spindown are issued for all halts, I hope someone makes that a kernel > tunable. I ment, that we shouldn't do this for shutdown -r. In all other cases I asume that it can't hurt even if it is not required for specific drives. > What I was hoping is that someone could point me to the "spinup" command as > I have a drive which does not spin up until it receives this command. Any > takers? For CAM there is camcontrol start. Not sure about ATA drives. -- B.Walter http://www.bwct.de Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm. From rwatson at FreeBSD.org Mon Mar 9 16:35:59 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Mon Mar 9 16:36:06 2009 Subject: lockf: Invalid argument on pipe In-Reply-To: References: Message-ID: On Mon, 9 Mar 2009, Timothy Redaelli wrote: > Why can't I do a lockf on a file descriptor that does not point a real file > (such as stderr, stdout, or a character device)? > > Since it works under NetBSD, Linux, Solaris. For portability between systems > I hope I can do it under FreeBSD. > > The following code is simple, but It reproduce the problem. Under > non-FreeBSD systems, It will block before the puts. Instead under FreeBSD > the lockf calls return error and, so, the lock does not works. Could you file a PR for this, with pretty much this e-mail and sample code included? There's no real reason not for it to work other than that it is likely not implemented for devfs; that should be easy to fix it but opening a PR will help us keep track of the fact that it wants to be fixed. thanks, Robert N M Watson Computer Laboratory University of Cambridge > > Any suggest? 
> > > #include > #include > #include > #include > > int main(int argc, char *argv[]) { > char tmp[256]; > > if (lockf(2, F_LOCK, 0) == -1) > perror("lock"); > snprintf (tmp, 256, "%s XXX", argv[0]); > if (!argv[1] || strcmp(argv[1], "XXX")) > system(tmp); > puts("You should see it only after ctrl+c"); > return EXIT_SUCCESS; > } > > > -- > Timothy Redaelli > IT Consultant > Email: timothy@redaelli.eu > Mobile: +39 (338) 1187273 > WWW: http://www.redaelli.eu/ > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From ken at mthelicon.com Mon Mar 9 18:04:57 2009 From: ken at mthelicon.com (Pegasus Mc Cleaft) Date: Mon Mar 9 18:05:03 2009 Subject: bsdtar lockup on Current-03/10/2009 Message-ID: <200903100104.53847.ken@mthelicon.com> Hi Current & Hackers, I was wondering if anyone else is seeing this problem: Any use of bsdtar to create a new archive causes the process to be unresponsive to all signals and consumes 100% cpu time. The machine I am testing on is a Core 2 quad running in AMD64 (8 gigs ram, zfs boot, root, et al.) I have tried disabling the zil and prefetch as a precaution but can still cause the failure by doing the command below(trying to eliminate zfs writes as being the problem): #tar -cvf /dev/null * Unarchiving from tar seems to work OK. The bug may have been introduced a few days ago. I just noticed my machine doing this tonight when I tried to do a portupgrade and the creation of the backups locked up. 
Thanks, Peg From bsd.quest at googlemail.com Tue Mar 10 03:16:30 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Tue Mar 10 03:16:37 2009 Subject: wrong data in remapped buffer In-Reply-To: <200903091618.32955.jhb@freebsd.org> References: <671bb5fc0903060813s284673e2t4d3c77b0ed6abc54@mail.gmail.com> <200903090821.25871.jhb@freebsd.org> <671bb5fc0903091238q2c4e4bd7m661333a509395b61@mail.gmail.com> <200903091618.32955.jhb@freebsd.org> Message-ID: <671bb5fc0903100316s16c0ae36ocaac3cdab955584d@mail.gmail.com> 2009/3/9 John Baldwin > On Monday 09 March 2009 3:38:55 pm Alexej Sokolov wrote: > > 2009/3/9 John Baldwin > > > > > On Friday 06 March 2009 11:13:38 am Alexej Sokolov wrote: > > > > Hello, > > > > I try to MALLOC a buffer in kern, then remap it with vm_map_find(), > to > > > space > > > > of user process. > > > > Some times the remapped buffer in user space contain incorrect data. > > > > > > What architecture are you using? On some archs like amd64, small > mallocs > > > (<= > > > PAGE_SIZE) don't use the kmem_map or kmem_object. > > > > > > -- > > > John Baldwin > > > > > anyway , the error happens only some times... I think there is other > reason. > > My hardware is amd64 > > % uname -ms > > FreeBSD i386 > > i386 always uses kmem for malloc(9). > > -- > John Baldwin ok, and what should be a reason of inconsistent data after remapping ? From kostikbel at gmail.com Tue Mar 10 04:17:21 2009 From: kostikbel at gmail.com (Kostik Belousov) Date: Tue Mar 10 04:17:28 2009 Subject: lockf: Invalid argument on pipe In-Reply-To: References: Message-ID: <20090310111712.GA41617@deviant.kiev.zoral.com.ua> On Mon, Mar 09, 2009 at 11:35:58PM +0000, Robert Watson wrote: > > On Mon, 9 Mar 2009, Timothy Redaelli wrote: > > >Why can't I do a lockf on a file descriptor that does not point a real > >file (such as stderr, stdout, or a character device)? > > > >Since it works under NetBSD, Linux, Solaris. 
For portability between
> >systems I hope I can do it under FreeBSD.
> >
> >The following code is simple, but It reproduce the problem. Under
> >non-FreeBSD systems, It will block before the puts. Instead under FreeBSD
> >the lockf calls return error and, so, the lock does not works.
>
> Could you file a PR for this, with pretty much this e-mail and sample code
> included? There's no real reason not for it to work other than that it is
> likely not implemented for devfs; that should be easy to fix it but opening
> a PR will help us keep track of the fact that it wants to be fixed.
>
> thanks,
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
>
> >
> >Any suggest?
> >
> >
> >#include <stdio.h>
> >#include <stdlib.h>
> >#include <string.h>
> >#include <unistd.h>
> >
> >int main(int argc, char *argv[]) {
> >  char tmp[256];
> >
> >  if (lockf(2, F_LOCK, 0) == -1)
> >    perror("lock");
> >  snprintf (tmp, 256, "%s XXX", argv[0]);
> >  if (!argv[1] || strcmp(argv[1], "XXX"))
> >    system(tmp);
> >  puts("You should see it only after ctrl+c");
> >  return EXIT_SUCCESS;
> >}
>
>

It is explicitly disabled in devfs code. The following patch works for me.

diff --git a/sys/fs/devfs/devfs_vnops.c b/sys/fs/devfs/devfs_vnops.c
index 1087452..b890da7 100644
--- a/sys/fs/devfs/devfs_vnops.c
+++ b/sys/fs/devfs/devfs_vnops.c
@@ -452,14 +452,6 @@ devfs_access(struct vop_access_args *ap)
 
 /* ARGSUSED */
 static int
-devfs_advlock(struct vop_advlock_args *ap)
-{
-
-	return (ap->a_flags & F_FLOCK ? EOPNOTSUPP : EINVAL);
-}
-
-/* ARGSUSED */
-static int
 devfs_close(struct vop_close_args *ap)
 {
 	struct vnode *vp = ap->a_vp, *oldvp;
@@ -1552,7 +1544,6 @@ static struct vop_vector devfs_specops = {
 	.vop_default =		&default_vnodeops,
 
 	.vop_access =		devfs_access,
-	.vop_advlock =		devfs_advlock,
 	.vop_bmap =		VOP_PANIC,
 	.vop_close =		devfs_close,
 	.vop_create =		VOP_PANIC,
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090310/ee95ae07/attachment.pgp From jhb at freebsd.org Tue Mar 10 06:39:59 2009 From: jhb at freebsd.org (John Baldwin) Date: Tue Mar 10 06:40:05 2009 Subject: wrong data in remapped buffer In-Reply-To: <671bb5fc0903100316s16c0ae36ocaac3cdab955584d@mail.gmail.com> References: <671bb5fc0903060813s284673e2t4d3c77b0ed6abc54@mail.gmail.com> <200903091618.32955.jhb@freebsd.org> <671bb5fc0903100316s16c0ae36ocaac3cdab955584d@mail.gmail.com> Message-ID: <200903100931.34071.jhb@freebsd.org> On Tuesday 10 March 2009 6:16:27 am Alexej Sokolov wrote: > 2009/3/9 John Baldwin > > > On Monday 09 March 2009 3:38:55 pm Alexej Sokolov wrote: > > > 2009/3/9 John Baldwin > > > > > > > On Friday 06 March 2009 11:13:38 am Alexej Sokolov wrote: > > > > > Hello, > > > > > I try to MALLOC a buffer in kern, then remap it with vm_map_find(), > > to > > > > space > > > > > of user process. > > > > > Some times the remapped buffer in user space contain incorrect data. > > > > > > > > What architecture are you using? On some archs like amd64, small > > mallocs > > > > (<= > > > > PAGE_SIZE) don't use the kmem_map or kmem_object. > > > > > > > > -- > > > > John Baldwin > > > > > > > anyway , the error happens only some times... I think there is other > > reason. > > > My hardware is amd64 > > > % uname -ms > > > FreeBSD i386 > > > > i386 always uses kmem for malloc(9). > > > > -- > > John Baldwin > > ok, > and what should be a reason of inconsistent data after remapping ? I don't know off the top of my head. I'm not really sure your use of vm_map_find() is correct, but I don't know it well enough to comment further. 
-- John Baldwin From bsd.quest at googlemail.com Tue Mar 10 09:05:53 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Tue Mar 10 09:06:00 2009 Subject: write protection by mmap of /dev/mem doesn't work Message-ID: <671bb5fc0903100905u7484cd1dp9faefc8bc208a6a6@mail.gmail.com> hello, How can I mmap some memory regions with PROT_WRITE protection flag ? What i do: /* Open mem device */ if ((devmem_fd = open("/dev/mem", O_RDWR)) == -1){ perror("/dev/mem"); exit (1); } then if I try to mmap some memory region with PROT_READ it goes Ok. But by PROT_WRITE it doesn't work: sp = mmap ( 0, MCLBYTES, /* Size of remapped buffer = size of mbuf cluster */ PROT_WRITE, MAP_SHARED, devmem_fd, phys_addr /* Physical addres of packet buffer from descriptor */ ); I get by PROT_WRITE " segmentation fault" What is the problem here ? And question again: How can I do it possible to remapp the kernel memory region to user space process through /dev/mem and give to this user process write permissions to remmaped space ? Thanx From bsd.quest at googlemail.com Tue Mar 10 09:32:45 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Tue Mar 10 09:32:52 2009 Subject: Fwd: write protection by mmap of /dev/mem doesn't work In-Reply-To: <671bb5fc0903100905u7484cd1dp9faefc8bc208a6a6@mail.gmail.com> References: <671bb5fc0903100905u7484cd1dp9faefc8bc208a6a6@mail.gmail.com> Message-ID: <671bb5fc0903100932y71e6c78bq8904224959e24e55@mail.gmail.com> Sorry, it was my mistake ! seg fault was by reading of data. To do this should PROT_READ|PROT_WRITE be setted. Now it works! Alexej ---------- Forwarded message ---------- From: Alexej Sokolov Date: 2009/3/10 Subject: write protection by mmap of /dev/mem doesn't work To: freebsd-hackers@freebsd.org hello, How can I mmap some memory regions with PROT_WRITE protection flag ? 
What i do: /* Open mem device */ if ((devmem_fd = open("/dev/mem", O_RDWR)) == -1){ perror("/dev/mem"); exit (1); } then if I try to mmap some memory region with PROT_READ it goes Ok. But by PROT_WRITE it doesn't work: sp = mmap ( 0, MCLBYTES, /* Size of remapped buffer = size of mbuf cluster */ PROT_WRITE, MAP_SHARED, devmem_fd, phys_addr /* Physical addres of packet buffer from descriptor */ ); I get by PROT_WRITE " segmentation fault" What is the problem here ? And question again: How can I do it possible to remapp the kernel memory region to user space process through /dev/mem and give to this user process write permissions to remmaped space ? Thanx From vasanth.raonaik at gmail.com Tue Mar 10 10:26:29 2009 From: vasanth.raonaik at gmail.com (vasanth raonaik) Date: Tue Mar 10 10:26:35 2009 Subject: Debugging init process. Message-ID: Hello Team, I need to debug init process. I am not able to attach init to gdb and it throws GNU gdb 6.5 [juniper_2006a_411] Copyright (C) 2006 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-specifix.com-freebsd"... (gdb) attach 1 Attaching to program: /var/tmp/abhi/init, process 1 ptrace: Invalid argument. (gdb) Can any one help me out in debugging init step execution. Thanks in advance, Vasanth From pluknet at gmail.com Tue Mar 10 13:13:52 2009 From: pluknet at gmail.com (pluknet) Date: Tue Mar 10 13:13:59 2009 Subject: Debugging init process. In-Reply-To: References: Message-ID: 2009/3/10 vasanth raonaik : > Hello Team, > > I need to debug init process. I am not able to attach init to gdb and it > throws > That is because init is a system process, which you cannot trace by design (see ptrace(2)). 
$ ps -o flags -p 1 F 10004200 , where from ps(1): P_SYSTEM 0x00200 System proc: no sigs, stats or swapping -- wbr, pluknet From gabor at FreeBSD.org Tue Mar 10 14:51:46 2009 From: gabor at FreeBSD.org (=?ISO-8859-1?Q?G=E1bor_K=F6vesd=E1n?=) Date: Tue Mar 10 14:51:53 2009 Subject: fgetc doubts Message-ID: <49B6DC95.9070607@FreeBSD.org> Hello, I have a problem when reading files with fgetc when a 0xff character comes. In my code the reading stops at that point as if EOF had been reached, but that's not actually the case. The code is here: http://p4web.freebsd.org/@md=d&cd=//&c=Nsd@//depot/projects/soc2008/gabor_textproc/grep/file.c?ac=64&rev1=40 And the problem occurs in grep_fgetln() when the buffers is being filled in: for (; i < bufsiz && !grep_feof(f); i++) binbuf[i] = grep_fgetc(f); Thanks in advance, -- Gabor Kovesdan FreeBSD Volunteer EMAIL: gabor@FreeBSD.org .:|:. gabor@kovesdan.org WEB: http://people.FreeBSD.org/~gabor .:|:. http://kovesdan.org From ed at 80386.nl Tue Mar 10 14:55:14 2009 From: ed at 80386.nl (Ed Schouten) Date: Tue Mar 10 14:55:20 2009 Subject: fgetc doubts In-Reply-To: <49B6DC95.9070607@FreeBSD.org> References: <49B6DC95.9070607@FreeBSD.org> Message-ID: <20090310215512.GI31961@hoeg.nl> * G?bor K?vesd?n wrote: > Hello, > > I have a problem when reading files with fgetc when a 0xff character > comes. In my code the reading stops at that point as if EOF had been > reached, but that's not actually the case. > The code is here: > http://p4web.freebsd.org/@md=d&cd=//&c=Nsd@//depot/projects/soc2008/gabor_textproc/grep/file.c?ac=64&rev1=40 > And the problem occurs in grep_fgetln() when the buffers is being filled in: > for (; i < bufsiz && !grep_feof(f); i++) > binbuf[i] = grep_fgetc(f); > > Thanks in advance, Sign extension bug? -- Ed Schouten WWW: http://80386.nl/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090310/a0bdc00c/attachment.pgp From gabor at FreeBSD.org Tue Mar 10 15:00:53 2009 From: gabor at FreeBSD.org (=?ISO-8859-1?Q?G=E1bor_K=F6vesd=E1n?=) Date: Tue Mar 10 15:00:59 2009 Subject: fgetc doubts In-Reply-To: <20090310215512.GI31961@hoeg.nl> References: <49B6DC95.9070607@FreeBSD.org> <20090310215512.GI31961@hoeg.nl> Message-ID: <49B6E30F.7020205@FreeBSD.org> Ed Schouten escribi?: > * G?bor K?vesd?n wrote: > >> Hello, >> >> I have a problem when reading files with fgetc when a 0xff character >> comes. In my code the reading stops at that point as if EOF had been >> reached, but that's not actually the case. >> The code is here: >> http://p4web.freebsd.org/@md=d&cd=//&c=Nsd@//depot/projects/soc2008/gabor_textproc/grep/file.c?ac=64&rev1=40 >> And the problem occurs in grep_fgetln() when the buffers is being filled in: >> for (; i < bufsiz && !grep_feof(f); i++) >> binbuf[i] = grep_fgetc(f); >> >> Thanks in advance, >> > > Sign extension bug? > I tried to substitute everything with int, because fgetc can return some error code afaik, but using int didn't help. -- Gabor Kovesdan FreeBSD Volunteer EMAIL: gabor@FreeBSD.org .:|:. gabor@kovesdan.org WEB: http://people.FreeBSD.org/~gabor .:|:. http://kovesdan.org From delphij at delphij.net Tue Mar 10 15:24:33 2009 From: delphij at delphij.net (Xin LI) Date: Tue Mar 10 15:24:41 2009 Subject: fgetc doubts In-Reply-To: <49B6E30F.7020205@FreeBSD.org> References: <49B6DC95.9070607@FreeBSD.org> <20090310215512.GI31961@hoeg.nl> <49B6E30F.7020205@FreeBSD.org> Message-ID: <49B6E895.9040701@delphij.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, G?bor, G?bor K?vesd?n wrote: > Ed Schouten escribi?: >> * G?bor K?vesd?n wrote: >> >>> Hello, >>> >>> I have a problem when reading files with fgetc when a 0xff character >>> comes. 
In my code the reading stops at that point as if EOF had been >>> reached, but that's not actually the case. >>> The code is here: >>> http://p4web.freebsd.org/@md=d&cd=//&c=Nsd@//depot/projects/soc2008/gabor_textproc/grep/file.c?ac=64&rev1=40 >>> >>> And the problem occurs in grep_fgetln() when the buffers is being >>> filled in: >>> for (; i < bufsiz && !grep_feof(f); i++) >>> binbuf[i] = grep_fgetc(f); >>> >>> Thanks in advance, >>> >> >> Sign extension bug? >> > I tried to substitute everything with int, because fgetc can return some > error code afaik, but using int didn't help. Is binbuf[] an array of char or unsigned char? If it's signed char then you may want something like ch = binbufptr[0] & 0xff I guess. Cheers, - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) iEYEARECAAYFAkm26JUACgkQi+vbBBjt66CePwCgtXlqAYcdP6G1EUUtGk0nu7vD I1sAoIJ+Hpop5mIHDdbfcXAbwMsqht2P =A8DH -----END PGP SIGNATURE----- From gabor at FreeBSD.org Tue Mar 10 15:46:47 2009 From: gabor at FreeBSD.org (=?ISO-8859-1?Q?G=E1bor_K=F6vesd=E1n?=) Date: Tue Mar 10 15:46:54 2009 Subject: fgetc doubts In-Reply-To: <49B6E895.9040701@delphij.net> References: <49B6DC95.9070607@FreeBSD.org> <20090310215512.GI31961@hoeg.nl> <49B6E30F.7020205@FreeBSD.org> <49B6E895.9040701@delphij.net> Message-ID: <49B6EDD1.5070602@FreeBSD.org> Xin LI escribi?: > > Is binbuf[] an array of char or unsigned char? If it's signed char then > you may want something like ch = binbufptr[0] & 0xff I guess. > Hi, thanks, it is now satrting to work, binbuf was of signed int. But now, I've got one more of that strange character at the end of the output, while it's not really there. -- Gabor Kovesdan FreeBSD Volunteer EMAIL: gabor@FreeBSD.org .:|:. gabor@kovesdan.org WEB: http://people.FreeBSD.org/~gabor .:|:. 
http://kovesdan.org From jimmy at mammothcheese.ca Tue Mar 10 15:53:15 2009 From: jimmy at mammothcheese.ca (James Bailie) Date: Tue Mar 10 15:53:22 2009 Subject: fgetc doubts In-Reply-To: <49B6DC95.9070607@FreeBSD.org> References: <49B6DC95.9070607@FreeBSD.org> Message-ID: <49B6E91A.4040604@mammothcheese.ca> fgetc() returns an int so that EOF may be distinguished from valid return values. Valid values are 8-bit values. EOF is a 32-bit value. EOF is a 32-bit two's-complement -1 (0xffffffff), and -1 input is 8-bit two's-complement -1 (0xff). When fgetc() casts this to an int, it becomes 0x000000ff, or 255, and thus the two values may be distinguished from each other. I haven't looked at your code, but you are probably comparing EOF with the value returned by fgetc() after it has been cast to a char. EOF is getting cast to a char implicitly in the comparison, so the comparison becomes a comparison between 0xff and 0xff. You need to test the int returned by fgetc() for EOF before assigning it to a char. G?bor K?vesd?n wrote: > Hello, > > I have a problem when reading files with fgetc when a 0xff character > comes. In my code the reading stops at that point as if EOF had been > reached, but that's not actually the case. > The code is here: > http://p4web.freebsd.org/@md=d&cd=//&c=Nsd@//depot/projects/soc2008/gabor_textproc/grep/file.c?ac=64&rev1=40 > > And the problem occurs in grep_fgetln() when the buffers is being filled > in: > for (; i < bufsiz && !grep_feof(f); i++) > binbuf[i] = grep_fgetc(f); > > Thanks in advance, > -- James Bailie http://www.mammothcheese.ca From jimmy at mammothcheese.ca Tue Mar 10 15:59:19 2009 From: jimmy at mammothcheese.ca (James Bailie) Date: Tue Mar 10 15:59:26 2009 Subject: fgetc doubts In-Reply-To: <49B6E91A.4040604@mammothcheese.ca> References: <49B6DC95.9070607@FreeBSD.org> <49B6E91A.4040604@mammothcheese.ca> Message-ID: <49B6F0C5.8050706@mammothcheese.ca> I must correct myself. 
It's more likely the return value of fgetc(), after having been
assigned to a char, is being sign-extended when that char is compared to
the int EOF, so that the comparison becomes a comparison between
0xffffffff and 0xffffffff.

James Bailie wrote:
> ...EOF is getting cast to a char implicitly in the comparison, so the
> comparison becomes a comparison between 0xff and 0xff.

-- 
James Bailie
http://www.mammothcheese.ca

From gabor at FreeBSD.org Tue Mar 10 16:00:23 2009
From: gabor at FreeBSD.org (=?ISO-8859-1?Q?G=E1bor_K=F6vesd=E1n?=)
Date: Tue Mar 10 16:00:30 2009
Subject: fgetc doubts
In-Reply-To: <49B6E91A.4040604@mammothcheese.ca>
References: <49B6DC95.9070607@FreeBSD.org> <49B6E91A.4040604@mammothcheese.ca>
Message-ID: <49B6F0FB.5020307@FreeBSD.org>

James Bailie escribió:
> fgetc() returns an int so that EOF may be distinguished from valid
> return values. Valid values are 8-bit values. EOF is a 32-bit value.
>
> EOF is a 32-bit two's-complement -1 (0xffffffff), and -1 input is 8-bit
> two's-complement -1 (0xff). When fgetc() casts this to an int, it
> becomes 0x000000ff, or 255, and thus the two values may be distinguished
> from each other.
>
> I haven't looked at your code, but you are probably comparing EOF with
> the value returned by fgetc() after it has been cast to a char. EOF is
> getting cast to a char implicitly in the comparison, so the comparison
> becomes a comparison between 0xff and 0xff. You need to test the int
> returned by fgetc() for EOF before assigning it to a char.

Thanks, I've found all the pieces of the puzzle from the three comments
and it works now.

-- 
Gabor Kovesdan
FreeBSD Volunteer

EMAIL: gabor@FreeBSD.org .:|:. gabor@kovesdan.org
WEB: http://people.FreeBSD.org/~gabor .:|:. http://kovesdan.org

From doconnor at gsoft.com.au Tue Mar 10 18:40:03 2009
From: doconnor at gsoft.com.au (Daniel O'Connor)
Date: Tue Mar 10 18:40:11 2009
Subject: Debugging init process.
In-Reply-To: References: Message-ID: <200903111209.58753.doconnor@gsoft.com.au> On Wednesday 11 March 2009 06:43:50 pluknet wrote: > 2009/3/10 vasanth raonaik : > > Hello Team, > > > > I need to debug init process. I am not able to attach init to gdb and it > > throws > > That is because init is a system process, which you cannot trace by design > (see ptrace(2)). Interesting, but it doesn't really help him debug it ;) Unless there is some other way around it you can stop the kernel making it a system process by editing /usr/src/sys/kern/init_main.c around line 730 (in create_init). Although some signal code seems to specialcase PID 1 so maybe that won't work either.. -- Daniel O'Connor software and network engineer for Genesis Software - http://www.gsoft.com.au "The nice thing about standards is that there are so many of them to choose from." -- Andrew Tanenbaum GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090311/01ca3c39/attachment.pgp From neldredge at math.ucsd.edu Tue Mar 10 19:02:17 2009 From: neldredge at math.ucsd.edu (Nate Eldredge) Date: Tue Mar 10 19:02:24 2009 Subject: Debugging init process. In-Reply-To: References: Message-ID: On Tue, 10 Mar 2009, vasanth raonaik wrote: > Hello Team, > > I need to debug init process. I am not able to attach init to gdb and it > throws As others mentioned, this is explicitly disabled. You could re-enable it by hacking the kernel, but it could cause other unexpected problems. Alternatively, there's always "printf debugging". What is wrong with init, that you need to debug it? It's a fairly simple program that's been around for a long time and should be pretty stable. 
-- 
Nate Eldredge
neldredge@math.ucsd.edu

From dillon at apollo.backplane.com Tue Mar 10 22:13:52 2009
From: dillon at apollo.backplane.com (Matthew Dillon)
Date: Tue Mar 10 22:13:59 2009
Subject: Google SoC 2009 Idea
References: <49A5D6FC.1090800@freebsd.org> <49A6CF27.3000203@freebsd.org> <7d6fde3d0902260924r45ebb7c8i46cd6daf43a8171d@mail.gmail.com>
Message-ID: <200903110502.n2B52Gdr008609@apollo.backplane.com>

I'll put in a plug for using DragonFly's pluggable scheduler framework :-).

We (DragonFly) also offer shell accounts, git integration and publishing,
a virtual kernel build/run environment for doing kernel projects, and
help over IRC and email.

Someone with the gumption to post the idea to multiple project lists is
probably going to be qualified to do the work. It would be quite
hilarious to find the projects in a position to compete for SoC people.

-Matt

From v.haisman at sh.cvut.cz Wed Mar 11 01:46:16 2009
From: v.haisman at sh.cvut.cz (=?UTF-8?B?VsOhY2xhdiBIYWlzbWFu?=)
Date: Wed Mar 11 01:46:23 2009
Subject: fgetc doubts
In-Reply-To: <49B6DC95.9070607@FreeBSD.org>
References: <49B6DC95.9070607@FreeBSD.org>
Message-ID: <49B77A45.3000204@sh.cvut.cz>

Gábor Kövesdán wrote, On 10.3.2009 22:33:
> Hello,
>
> I have a problem when reading files with fgetc when a 0xff character
> comes. In my code the reading stops at that point as if EOF had been
> reached, but that's not actually the case.
> The code is here:
> http://p4web.freebsd.org/@md=d&cd=//&c=Nsd@//depot/projects/soc2008/gabor_textproc/grep/file.c?ac=64&rev1=40

You have a bug in the grep_fgetc() function in the BZIP case. Char type is
signed on FreeBSD and you are sign extending the c variable in the
"return (c)" statement. The line should read "return ((unsigned char)c)", if
you want to model the function using the same semantics as C99 fgetc().
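
[Archive note] The fgetc() semantics discussed in this thread can be
sketched in C roughly as follows. This is only an illustration under
assumptions: byte_as_fgetc() and fill_buffer() are hypothetical names that
do not appear in the grep sources under discussion, and the loop shows the
common "test the int against EOF before storing it" idiom rather than
whatever change was actually made to file.c.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: return a byte held in a plain char the way fgetc()
 * reports it, i.e. as an unsigned char converted to int. A 0xff byte then
 * becomes 255 and cannot compare equal to EOF, which is negative.
 */
static int
byte_as_fgetc(char c)
{
	return ((unsigned char)c);
}

/*
 * Hypothetical helper: read up to bufsiz bytes from fp into buf and return
 * the number of bytes stored. The result of fgetc() is kept in an int and
 * compared with EOF before it is narrowed, so a 0xff data byte is stored
 * like any other byte and the EOF marker itself is never stored.
 */
static size_t
fill_buffer(FILE *fp, unsigned char *buf, size_t bufsiz)
{
	size_t i = 0;
	int ch;

	assert(fp != NULL && buf != NULL);
	while (i < bufsiz && (ch = fgetc(fp)) != EOF)
		buf[i++] = (unsigned char)ch;
	return (i);
}
```

A loop that tests feof() before each read only learns about end of file
after a read has already failed, so it tends to store one extra EOF value
at the end. That may be related to the stray trailing character mentioned
earlier in the thread, though this is a guess made without seeing
grep_feof().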
> > And the problem occurs in grep_fgetln() when the buffers is being filled > in: > for (; i < bufsiz && !grep_feof(f); i++) > binbuf[i] = grep_fgetc(f); > > Thanks in advance, > -- VH From v.haisman at sh.cvut.cz Wed Mar 11 01:54:16 2009 From: v.haisman at sh.cvut.cz (=?UTF-8?B?VsOhY2xhdiBIYWlzbWFu?=) Date: Wed Mar 11 01:54:23 2009 Subject: fgetc doubts In-Reply-To: <49B6DC95.9070607@FreeBSD.org> References: <49B6DC95.9070607@FreeBSD.org> Message-ID: <49B77C28.9020504@sh.cvut.cz> G?bor K?vesd?n wrote, On 10.3.2009 22:33: > Hello, [...] > And the problem occurs in grep_fgetln() when the buffers is being filled > in: > for (; i < bufsiz && !grep_feof(f); i++) > binbuf[i] = grep_fgetc(f); > Also, why are you filling the buffer char by char? All of the input streams you have there support reading by chunks. -- VH From Alexander at Leidinger.net Wed Mar 11 00:51:50 2009 From: Alexander at Leidinger.net (Alexander Leidinger) Date: Wed Mar 11 04:20:24 2009 Subject: Debugging init process. In-Reply-To: References: Message-ID: <20090311085138.23982deb8g8234w0@webmail.leidinger.net> Quoting Nate Eldredge (from Tue, 10 Mar 2009 19:02:16 -0700 (PDT)): > On Tue, 10 Mar 2009, vasanth raonaik wrote: > >> Hello Team, >> >> I need to debug init process. I am not able to attach init to gdb and it >> throws > > As others mentioned, this is explicitly disabled. You could > re-enable it by hacking the kernel, but it could cause other > unexpected problems. > > Alternatively, there's always "printf debugging". > > What is wrong with init, that you need to debug it? It's a fairly > simple program that's been around for a long time and should be > pretty stable. If this is on -current and depending on the problem, dtrace may be an option (I don't know if it special-cases init or not). Bye, Alexander. -- Don't interfere with the stranger's style. 
http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From ed at 80386.nl Wed Mar 11 06:05:59 2009 From: ed at 80386.nl (Ed Schouten) Date: Wed Mar 11 06:06:06 2009 Subject: [PATCH] Support for thresholds in du(1) In-Reply-To: <200902251724.40212.fbsd.hackers@rachie.is-a-geek.net> References: <200902251724.40212.fbsd.hackers@rachie.is-a-geek.net> Message-ID: <20090311130557.GJ31961@hoeg.nl> * Mel wrote: > Example usage: > # du -xht 20m . > 29M ./contrib/binutils > 52M ./contrib/gcc > 237M ./contrib > 35M ./crypto > 28M ./lib > 20M ./share > 55M ./sys/dev > 139M ./sys > 545M . Ooh! That looks awesome! -- Ed Schouten WWW: http://80386.nl/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090311/82795c1f/attachment.pgp From rwatson at FreeBSD.org Wed Mar 11 06:13:35 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Wed Mar 11 06:13:41 2009 Subject: Debugging init process. In-Reply-To: References: Message-ID: On Tue, 10 Mar 2009, Nate Eldredge wrote: > On Tue, 10 Mar 2009, vasanth raonaik wrote: > >> I need to debug init process. I am not able to attach init to gdb and it >> throws > > As others mentioned, this is explicitly disabled. You could re-enable it by > hacking the kernel, but it could cause other unexpected problems. > > Alternatively, there's always "printf debugging". > > What is wrong with init, that you need to debug it? It's a fairly simple > program that's been around for a long time and should be pretty stable. 
One specific concern with debugging init is that the ptracing application will
intercept all signals destined for the target application, and since signals
are used to manage the system run cycle, and init becomes the parent of
orphaned processes, this could lead to unexpected side effects. Also, if init
exits, your system will panic :-).

Last time I mucked with init, I found the best ways to debug it were:

(1) Provide a LD_PRELOAD for libc that replaces getpid(), getppid(), and other
    things to return pid 1.

(2) Run init in a jail so that if it exits, the box won't panic.

(3) Use tools like ktrace on the actual init, combined with utrace()
    instrumentation of init so you can track its behavior "in the wild".

FYI, if you do want to clear P_SYSTEM, one easy way to do that is to attach
kgdb to /dev/mem, and directly manipulate the flags on initproc. This comes
with some risks, of course. :-)

Robert N M Watson
Computer Laboratory
University of Cambridge

From pluknet at gmail.com Wed Mar 11 06:38:42 2009
From: pluknet at gmail.com (pluknet)
Date: Wed Mar 11 06:38:50 2009
Subject: Non-existing p_vmspace. When is it possible?
In-Reply-To:
References:
Message-ID:

Hi.

I perform in FOREACH_PROC_IN_SYSTEM(), where I conditionally
look at p_vmspace internals. I'd like to know the safe way to
reference p_vmspace fields without potential null-dereference.

I see an example in vm_pageout_oom(), where making access to
p->p_vmspace fields is done without additional checks.
Nevertheless I want to further catch on how it works.

Currently I additionally explicitly check on P_SYSTEM and PRS_NEW,
then p->p_vmspace against NULL.

So I'd wish to understand if a time-window between
1) placing a new process to proclist
 and
2) attaching vmspace to this process
is possible at all, and then in what cases.

I see in fork1() that a new process' (named p2 here) state is set to
PRS_NEW just before LIST_INSERT_HEAD(&allproc, p2, p_list) and then
(after vmspace is already attached in vm_forkproc()) is set to PRS_NORMAL.

So an additional check on p_vmspace != NULL is not needed.
Am I right?

Thanks.

--
wbr,
pluknet

From kostikbel at gmail.com Wed Mar 11 07:38:40 2009
From: kostikbel at gmail.com (Kostik Belousov)
Date: Wed Mar 11 07:38:47 2009
Subject: Non-existing p_vmspace. When is it possible?
In-Reply-To:
References:
Message-ID: <20090311143830.GJ41617@deviant.kiev.zoral.com.ua>

On Wed, Mar 11, 2009 at 04:38:39PM +0300, pluknet wrote:
> Hi.
>
> I perform in FOREACH_PROC_IN_SYSTEM(), where I conditionally
> look at p_vmspace internals. I'd like to know the safe way to
> reference p_vmspace fields without potential null-dereference.
>
> I see an example in vm_pageout_oom(), where making access to
> p->p_vmspace fields is done without additional checks.
> Nevertheless I want to further catch on how it works.
>
> Currently I additionally explicitly check on P_SYSTEM and PRS_NEW,
> then p->p_vmspace against NULL.
>
> So I'd wish to understand if a time-window between
> 1) placing a new process to proclist
>  and
> 2) attaching vmspace to this process
> is possible at all, and then in what cases.
>
> I see in fork1() that a new process' (named p2 here) state is set to
> PRS_NEW just before LIST_INSERT_HEAD(&allproc, p2, p_list) and then
> (after vmspace is already attached in vm_forkproc()) is set to PRS_NORMAL.
>
> So an additional check on p_vmspace != NULL is not needed.
> Am I right?

The canonical sequence of doing this is, assuming p is a held pointer
to a process:

	vm = vmspace_acquire_ref(p);
	if (vm == NULL) {
		PRELE(p);
		return ?;
	}
	use vm;
	vmspace_free(vm);

Look around the tree for the vmspace_acquire_ref usage.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090311/36733197/attachment.pgp

From marius at nuenneri.ch Wed Mar 11 08:16:44 2009
From: marius at nuenneri.ch (Marius Nünnerich)
Date: Wed Mar 11 08:16:52 2009
Subject: Debugging init process.
In-Reply-To: <20090311085138.23982deb8g8234w0@webmail.leidinger.net>
References: <20090311085138.23982deb8g8234w0@webmail.leidinger.net>
Message-ID:

On Wed, Mar 11, 2009 at 08:51, Alexander Leidinger wrote:
> Quoting Nate Eldredge (from Tue, 10 Mar 2009
> 19:02:16 -0700 (PDT)):
>
>> On Tue, 10 Mar 2009, vasanth raonaik wrote:
>>
>>> Hello Team,
>>>
>>> I need to debug init process. I am not able to attach init to gdb and it
>>> throws
>>
>> As others mentioned, this is explicitly disabled. You could re-enable it
>> by hacking the kernel, but it could cause other unexpected problems.
>>
>> Alternatively, there's always "printf debugging".
>>
>> What is wrong with init, that you need to debug it? It's a fairly simple
>> program that's been around for a long time and should be pretty stable.
>
> If this is on -current and depending on the problem, dtrace may be an option
> (I don't know if it special-cases init or not).
>

DTrace is not available for userland processes yet.
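
As a rough illustration of suggestion (1) from Robert Watson's message earlier
in this thread (an LD_PRELOAD shim that makes getpid() report pid 1), a minimal
sketch might look like the following. The file name fakepid.c and the choice to
report pid 0 as the parent are assumptions made for the example, not something
stated in the thread.

```c
/* fakepid.c: hypothetical LD_PRELOAD shim, not taken from the thread. */
#include <sys/types.h>
#include <unistd.h>

/* Make the calling process believe it is init (pid 1). */
pid_t
getpid(void)
{
	return (1);
}

/* Assumption: report pid 0 as the parent, as a real init would see. */
pid_t
getppid(void)
{
	return (0);
}
```

Built as a shared object (for example "cc -shared -fPIC -o fakepid.so
fakepid.c") and loaded through LD_PRELOAD, these definitions are found before
the libc ones. Preloading only affects dynamically linked binaries, so this
presumably needs a dynamically linked test build of init.
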
From tijl at ulyssis.org Thu Mar 12 03:54:04 2009 From: tijl at ulyssis.org (Tijl Coosemans) Date: Thu Mar 12 03:54:12 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <49B04281.2030406@gmx.net> References: <200903041938.n24Jcqdr060153@lurza.secnetix.de> <49AF9381.50709@FreeBSD.org> <49B04281.2030406@gmx.net> Message-ID: <200903121124.33358.tijl@ulyssis.org> On Thursday 05 March 2009 22:22:09 Daniel Thiele wrote: > Looking at the numbers in the Hitachi drive specifications Tobias an I > dug out from the Hitachi website (see replies in the Joerg Sonnenberger > branch of this thread) the normal Load/Unload count is about 30 times > higher than the Emergency Unload count. So even if an > ATA_STANDBY_IMMEDIATE command may introduce additional Load/Unload > stress on reboot it is not as bad as the stress causes by an Emergency > Unload on shutdown. Of course this only applies if the "click" sound is > really caused by an Emergency Unload. Is there a way to figure out? > Maybe the S.M.A.R.T. feature records the two kinds of power-offs. Emergency Unload is called Power-Off_Retract_Count in SMART output. From tijl at ulyssis.org Thu Mar 12 04:30:24 2009 From: tijl at ulyssis.org (Tijl Coosemans) Date: Thu Mar 12 04:30:31 2009 Subject: Spin down HDD after disk sync or before power off In-Reply-To: <49B02211.1010809@abitos.org> References: <200903050758.n257wod8088426@lurza.secnetix.de> <49B02211.1010809@abitos.org> Message-ID: <200903121230.17041.tijl@ulyssis.org> On Thursday 05 March 2009 20:03:45 Tobias Blersch wrote: > http://www.hitachigst.com/tech/techlib.nsf/techdocs/28DCCB17E0EEC5A086256F4E006E2F5B > > Thats the specification for my notebooks hard drive. Section 6.6 > Reliability gives data about how to power-off the disk. It also > contains numbers of supported load/unloads and emergency unloads. > Emergency unloads are invoked when the heads are still loaded and > power fails. Quoting that document: 10.4.1 Emergency unload (...) 
Emergency unload is intended to be invoked in rare situations. Because
this operation is inherently uncontrolled, it is more mechanically
stressful than a normal unload. A single emergency unload operation is
more stressful than 100 normal unloads. Use of emergency unload reduces
the start/stop life of the HDD at a rate at least 100X faster than that
of normal unload, and may damage the HDD.

10.4.2 Required power-off sequence
(...)
You may then turn off the HDD in the following order:
- Issue Standby Immediate or sleep command.
- Wait until COMMAND COMPLETE STATUS is returned. (It may take up to
  350ms in typical case).
- Terminate power to HDD.
This power-down sequence should be followed for entry into any system
power-down state, or system suspend state, or system hibernation state.
In a robustly designed system, emergency unload is limited to rare
scenarios such as battery removal during operation.

From Alexander at Leidinger.net Thu Mar 12 03:42:10 2009
From: Alexander at Leidinger.net (Alexander Leidinger)
Date: Thu Mar 12 04:35:55 2009
Subject: Debugging init process.
In-Reply-To:
References: <20090311085138.23982deb8g8234w0@webmail.leidinger.net>
Message-ID: <20090312114155.19306tatpgd1ebk0@webmail.leidinger.net>

Quoting Marius Nünnerich (from Wed, 11 Mar 2009 15:54:44 +0100):

> On Wed, Mar 11, 2009 at 08:51, Alexander Leidinger
> wrote:
>> Quoting Nate Eldredge (from Tue, 10 Mar 2009
>> 19:02:16 -0700 (PDT)):
>>
>>> On Tue, 10 Mar 2009, vasanth raonaik wrote:
>>>
>>>> Hello Team,
>>>>
>>>> I need to debug init process. I am not able to attach init to gdb and it
>>>> throws
>>>
>>> As others mentioned, this is explicitly disabled. You could re-enable it
>>> by hacking the kernel, but it could cause other unexpected problems.
>>>
>>> Alternatively, there's always "printf debugging".
>>>
>>> What is wrong with init, that you need to debug it? It's a fairly simple
>>> program that's been around for a long time and should be pretty stable.
>> >> If this is on -current and depending on the problem, dtrace may be an option >> (I don't know if it special-cases init or not). >> > > DTrace is not available for userland processes yet. Depending on what is needed (it may not be needed to attach gdb, it may be sufficient to have something like ktrace, the OP didn't specify the problem), DTrace may suit the needs. Bye, Alexander. -- Your true value depends entirely on what you are compared with. http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From lavalamp at spiritual-machines.org Thu Mar 12 09:57:58 2009 From: lavalamp at spiritual-machines.org (Brian A. Seklecki) Date: Thu Mar 12 10:19:56 2009 Subject: shmmax tops out at 2G? In-Reply-To: <1235404207.31655.2085.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> References: <1235404207.31655.2085.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> Message-ID: <1236877076.15167.3946.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> Thanks to all; with the r1.114 changes, our staff reports the following: "Postgres is able to start with a ~3GB postgresql.conf(5) $shared_buffer on 8-CURRENT/amd64: PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND 1036 pgsql 1 44 0 3013M 79296K select 0:00 0.00% postgres kern.ipc.shmall: 786432 kern.ipc.shmmax: 3221225472 FreeBSD db0X 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Thu Mar 12 09:38:36 EDT 2009 foo@db02:/usr/obj/usr/src/sys/GENERIC amd64 ~BAS On Mon, 2009-02-23 at 10:50 -0500, Brian A. Seklecki wrote: > > On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote: > > >In response to Bill Moran : > > >> sysctl kern.ipc.shmmax=2200000000 > > >> kern.ipc.shmmax: 2100000000 -> -2094967296 > > Someone was nice enough to file a PR related to this: > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130274 > > We'd be happy to sponsor development in -current to address this > limitation. 
~BAS From ivoras at freebsd.org Fri Mar 13 02:50:15 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Fri Mar 13 02:50:23 2009 Subject: shmmax tops out at 2G? In-Reply-To: <1236877076.15167.3946.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> References: <1235404207.31655.2085.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <1236877076.15167.3946.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> Message-ID: Brian A. Seklecki wrote: > Thanks to all; with the r1.114 changes, our staff reports the following: > > "Postgres is able to start with a ~3GB postgresql.conf(5) $shared_buffer > on 8-CURRENT/amd64: It has recently also been MFC-ed to 7-STABLE :) (beware of instabilities and debugging in -CURRENT!) > PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND > 1036 pgsql 1 44 0 3013M 79296K select 0:00 0.00% postgres > > kern.ipc.shmall: 786432 > kern.ipc.shmmax: 3221225472 > > FreeBSD db0X 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Thu Mar 12 09:38:36 EDT > 2009 foo@db02:/usr/obj/usr/src/sys/GENERIC amd64 > > ~BAS > > On Mon, 2009-02-23 at 10:50 -0500, Brian A. Seklecki wrote: >>> On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote: >>>> In response to Bill Moran : >>>>> sysctl kern.ipc.shmmax=2200000000 >>>>> kern.ipc.shmmax: 2100000000 -> -2094967296 >> Someone was nice enough to file a PR related to this: >> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130274 >> >> We'd be happy to sponsor development in -current to address this >> limitation. ~BAS > > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090313/0cca1d4a/signature.pgp From pekka.nikander at nomadiclab.com Fri Mar 13 05:52:51 2009 From: pekka.nikander at nomadiclab.com (Pekka Nikander) Date: Fri Mar 13 05:53:04 2009 Subject: Problems mapping an vm_object to a process memory space Message-ID: <37364E21-701A-42F9-95B7-1B3386AEDE71@nomadiclab.com> As a part of a research project, I'm trying to build publish/subscribe shared memory semantics where the idea is to first map an vm_object as read/write to a publisher's memory space, and then a COW shadow of that later to the subscriber processes' memory space. I've got to the point where the code works most of the time, but at certain scenarios (which are hard to classify and seem slightly random) the mapping goes wrong, and either the subscriber process has no physical mapping at the supposed address or there appears some random page. To me it appears as if the vm_object, vm_map etc data structures are OK, but somehow the pmaps don't get right. I'm currently using 7.1 RELEASE on amd64, but I'm planning to try the same on -CURRENT as soon as I get it properly ported. I even tried calling pmap_enter_object explicitly before returning to the user space, but it doesn't seem to help. Another thing is that there may be some bugs related OBJ_ONEMAPPING. We need to explicitly clear it at places, and sometimes artificially bump up the vm_object reference count to avoid code related to ONEMAPPING from trashing the object's mappings. Is this a known issue? Any advice? --Pekka Nikander From avg at icyb.net.ua Fri Mar 13 06:27:04 2009 From: avg at icyb.net.ua (Andriy Gapon) Date: Fri Mar 13 06:27:11 2009 Subject: intpm: multiple salves, collision ?? Message-ID: <49BA5F24.1020105@icyb.net.ua> I observe some quite odd behavior with intpm(4). 
I have a program that access two slaves at a high rate (no sleeps or long calculations). The typical pattern of access is: 1. SMB_WRITEB slave1 reg1 2. SMB_READB slave1 reg2 3. SMB_READB *slave2* reg 4. SMB_READB slave1 reg3 There are many iterations of this pattern in a tight loop. There is definitely only one entity that uses SMBus - no other userland programs, nothing in kernel, nothing in ACPI and BIOS. At random iteration smb ioctl would fail with EIO. This happens consistently at step 4. Debugging printf in intpm gives this: intsmb0: error = 8, status = 0x8 That is, PIIX4_SMBHSTSTAT_BUSC translated to SMB_ECOLLI. Error can not be reproduced if only one slave is accessed, no matter in what patterns. -- Andriy Gapon From avg at freebsd.org Fri Mar 13 07:49:37 2009 From: avg at freebsd.org (Andriy Gapon) Date: Fri Mar 13 07:49:43 2009 Subject: intpm: multiple salves, collision ?? In-Reply-To: <49BA5F24.1020105@icyb.net.ua> References: <49BA5F24.1020105@icyb.net.ua> Message-ID: <49BA6DD0.1080407@freebsd.org> on 13/03/2009 15:27 Andriy Gapon said the following: > I observe some quite odd behavior with intpm(4). > I have a program that access two slaves at a high rate (no sleeps or > long calculations). The typical pattern of access is: > 1. SMB_WRITEB slave1 reg1 > 2. SMB_READB slave1 reg2 > 3. SMB_READB *slave2* reg > 4. SMB_READB slave1 reg3 > > There are many iterations of this pattern in a tight loop. > There is definitely only one entity that uses SMBus - no other userland > programs, nothing in kernel, nothing in ACPI and BIOS. > > At random iteration smb ioctl would fail with EIO. This happens > consistently at step 4. Debugging printf in intpm gives this: > intsmb0: error = 8, status = 0x8 > That is, PIIX4_SMBHSTSTAT_BUSC translated to SMB_ECOLLI. > > Error can not be reproduced if only one slave is accessed, no matter in > what patterns. 
> Sorry for the noise, the problem seems to be in misbehavior on part of one of the slaves used by the original program. I wrote a minimalistic test program and ran it for several different combinations of slaves - the issue only occurs if a certain slave is accessed, no problems for any other slaves. I wonder what could be wrong with that slave. -- Andriy Gapon From avg at freebsd.org Fri Mar 13 08:16:25 2009 From: avg at freebsd.org (Andriy Gapon) Date: Fri Mar 13 08:16:32 2009 Subject: intpm: multiple salves, collision [solved] In-Reply-To: <49BA6DD0.1080407@freebsd.org> References: <49BA5F24.1020105@icyb.net.ua> <49BA6DD0.1080407@freebsd.org> Message-ID: <49BA78C6.30202@freebsd.org> on 13/03/2009 16:29 Andriy Gapon said the following: > > Sorry for the noise, the problem seems to be in misbehavior on part of one of the > slaves used by the original program. I wrote a minimalistic test program and ran > it for several different combinations of slaves - the issue only occurs if a > certain slave is accessed, no problems for any other slaves. > I wonder what could be wrong with that slave. > Mystery is cleared - I should have used READW to access that slave. The slave is a little bit sloppy - it accepts READB but replies with two bytes of data, and its documentation implies that READB can be used, but I should have known better. -- Andriy Gapon From chris at smartt.com Fri Mar 13 11:14:59 2009 From: chris at smartt.com (Chris St Denis) Date: Fri Mar 13 11:15:06 2009 Subject: Bug in tcp wrappers? Message-ID: <49BA9E63.3040000@smartt.com> I think I've found a bug in libwrap/tcpwrappers. Before filing an actual bug report I want to get some feedback here first. A hosts.allow file with ~1000 ips on a single line (Haven't experimented with other quantities yet), causes network daemons that use libwrap stop accepting incoming network connections and use 100% cpu on an incoming connection. 
This problem appeared because sshguard placed a large number of IPs in my hosts.allow file triggering this bug. I've left the affected daemons for a long period of time (once about 8 hours) and they don't seem to come back, so I think this is more than just it taking a while to loop through a 1000 item array of IPs The production system that was affected is FreeBSD 7.0-32bit Test system is FreeBSD 7.1-32bit Example hosts.allow file (IPs are randomly generated for purposes of example) sshd : 112.110.123.63 113.11.2.126 113.11.8.6 113.19.19.22 113.197.48.68 116.48.108.244 116.48.11.19 : deny ALL : ALL : allow top output of affected system. sshd wcpu slowly crawls up to 100% over about 30 seconds or so. crash# top last pid: 692; load averages: 0.08, 0.04, 0.04 up 0+00:12:13 15:42:30 24 processes: 2 running, 22 sleeping CPU: 49.7% user, 0.0% nice, 0.2% system, 0.2% interrupt, 49.9% idle Mem: 9304K Active, 6004K Inact, 21M Wired, 32K Cache, 10M Buf, 947M Free Swap: 1995M Total, 1995M Free PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 691 root 1 103 0 5760K 3660K CPU1 1 0:04 33.98% sshd 672 root 1 4 0 8436K 3888K sbwait 1 0:00 0.00% sshd 677 cstdenis 1 20 0 4460K 2288K pause 0 0:00 0.00% csh 682 root 1 20 0 5484K 2632K pause 0 0:00 0.00% csh 675 cstdenis 1 44 0 8436K 3896K select 0 0:00 0.00% sshd A backtrace shows crash# gdb /usr/sbin/sshd 691 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd"... Attaching to program: /usr/sbin/sshd, process 691 Reading symbols from /usr/lib/libssh.so.4...done. Loaded symbols for /usr/lib/libssh.so.4 Reading symbols from /lib/libutil.so.7...done. 
Loaded symbols for /lib/libutil.so.7 Reading symbols from /lib/libz.so.4...done. Loaded symbols for /lib/libz.so.4 Reading symbols from /usr/lib/libwrap.so.5...done. Loaded symbols for /usr/lib/libwrap.so.5 Reading symbols from /libexec/ld-elf.so.1...done. Loaded symbols for /libexec/ld-elf.so.1 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at /usr/src/lib/libc/stdio/fgets.c:56 56 { (gdb) bt #0 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at /usr/src/lib/libc/stdio/fgets.c:56 #1 0x281124ee in xgets (ptr=0xbfbfe67b "", len=1, fp=0x283b8040) at /usr/src/lib/libwrap/../../contrib/tcp_wrappers/misc.c:38 #2 0x28111410 in table_match (table=0x28112c5c "/etc/hosts.allow", request=0xbfbfeb14) at /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:162 #3 0x28111540 in hosts_access (request=0xbfbfeb14) at /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:132 #4 0x08052b39 in main (ac=2, av=0xbfbfeecc) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:1843 (gdb) bt #0 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at /usr/src/lib/libc/stdio/fgets.c:56 #1 0x281124ee in xgets (ptr=0xbfbfe67b "", len=1, fp=0x283b8040) at /usr/src/lib/libwrap/../../contrib/tcp_wrappers/misc.c:38 #2 0x28111410 in table_match (table=0x28112c5c "/etc/hosts.allow", request=0xbfbfeb14) at /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:162 #3 0x28111540 in hosts_access (request=0xbfbfeb14) at /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:132 #4 0x08052b39 in main (ac=2, av=0xbfbfeecc) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:1843 (gdb) q The program is running. Quit anyway (and detach it)? (y or n) y Detaching from program: /usr/sbin/sshd, process 691 A few questions 1. Is this a known issue of any sort? I've done some searching on it, but haven't found anything of interest. 2. Should this be reported to FreeBSD bug tracker, or to libwrap (or both)? 
Basically, is FreeBSD's libwrap (more or less) in sync with the main one, or is it completely separate? -- Chris St Denis Programmer SmarttNet (www.smartt.com) Ph: 604-473-9700 Ext. 200 ------------------------------------------- "Smart Internet Solutions For Businesses" From bsd.quest at googlemail.com Fri Mar 13 11:18:49 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Fri Mar 13 11:18:55 2009 Subject: Problems mapping an vm_object to a process memory space In-Reply-To: <37364E21-701A-42F9-95B7-1B3386AEDE71@nomadiclab.com> References: <37364E21-701A-42F9-95B7-1B3386AEDE71@nomadiclab.com> Message-ID: <671bb5fc0903131118u31b5b9b6l46b5d063aee78ff0@mail.gmail.com> hi , I had a problem with remapping too. Could I see your code? here is my code, that some times on AMD64 runs wrong : http://pastebin.com/m78da0b37 And now I solved the problem with remapping by using /dev/mem device. It has mmap syscal. And it seems to be working without problem. Alexej < 2009/3/13 Pekka Nikander > As a part of a research project, I'm trying to build publish/subscribe > shared memory semantics where the idea is to first map an vm_object as > read/write to a publisher's memory space, and then a COW shadow of that > later to the subscriber processes' memory space. > > I've got to the point where the code works most of the time, but at certain > scenarios (which are hard to classify and seem slightly random) the mapping > goes wrong, and either the subscriber process has no physical mapping at the > supposed address or there appears some random page. To me it appears as if > the vm_object, vm_map etc data structures are OK, but somehow the pmaps > don't get right. I'm currently using 7.1 RELEASE on amd64, but I'm planning > to try the same on -CURRENT as soon as I get it properly ported. I even > tried calling pmap_enter_object explicitly before returning to the user > space, but it doesn't seem to help. > > Another thing is that there may be some bugs related OBJ_ONEMAPPING. 
We > need to explicitly clear it at places, and sometimes artificially bump up > the vm_object reference count to avoid code related to ONEMAPPING from > trashing the object's mappings. Is this a known issue? > > Any advice? > > --Pekka Nikander > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From assaulter0x80 at gmail.com Fri Mar 13 13:36:57 2009 From: assaulter0x80 at gmail.com (Jacky Oh) Date: Fri Mar 13 13:37:07 2009 Subject: sys/vimage.h and net/opt_inet6 not exist on 7.1 release? Message-ID: Hi, Im trying to compile sys/netinet/ip_fw_pfil.c as example of KLD firewall module. This file include net/opt_inet6.h and sys/vimage.h but the compiler dont find it. This files seems that was deleted from the source tree. Anyone know something?. Thanks From assaulter0x80 at gmail.com Fri Mar 13 13:38:00 2009 From: assaulter0x80 at gmail.com (Jacky Oh) Date: Fri Mar 13 13:38:07 2009 Subject: Fwd: sys/vimage.h and net/opt_inet6 not exist on 7.1 release? In-Reply-To: References: Message-ID: Hi, Im trying to compile sys/netinet/ip_fw_pfil.c as example of KLD firewall module. This file include net/opt_inet6.h and sys/vimage.h but the compiler dont find it. This files seems that was deleted from the source tree. Anyone know something?. 
Thanks From pekka.nikander at nomadiclab.com Fri Mar 13 14:20:00 2009 From: pekka.nikander at nomadiclab.com (Pekka Nikander) Date: Fri Mar 13 14:20:32 2009 Subject: Problems mapping an vm_object to a process memory space In-Reply-To: <671bb5fc0903131118u31b5b9b6l46b5d063aee78ff0@mail.gmail.com> References: <37364E21-701A-42F9-95B7-1B3386AEDE71@nomadiclab.com> <671bb5fc0903131118u31b5b9b6l46b5d063aee78ff0@mail.gmail.com> Message-ID: Hi Alexej, The actual mapping code is now at http://pastebin.com/m56a949a5 The objects in question are allocated through vm_pager_allocate with OBJT_SWAP. Note that I'm not sure when OBJ_ONEMAPPING clearing actually helps and when not -- I've more sprinkled it around the code in the hope of circumventing what I suspect is a bug. (But I also have to confess that I don't understand the internals of vm_object_deallocate well enough to really say where the bug might be, if there is one.) The code around lines 110-117 is my latest attempt to fix. The earlier version simply wired the pages. --Pekka On 13 Mar 2009, at 20:18, Alexej Sokolov wrote: > hi , > I had a problem with remapping too. Could I see your code? > here is my code, that some times on AMD64 runs wrong : > http://pastebin.com/m78da0b37 > > And now I solved the problem with remapping by using /dev/mem > device. It has mmap syscal. And it seems to be working without > problem. > > Alexej > < > > 2009/3/13 Pekka Nikander > As a part of a research project, I'm trying to build publish/ > subscribe shared memory semantics where the idea is to first map an > vm_object as read/write to a publisher's memory space, and then a > COW shadow of that later to the subscriber processes' memory space. > > I've got to the point where the code works most of the time, but at > certain scenarios (which are hard to classify and seem slightly > random) the mapping goes wrong, and either the subscriber process > has no physical mapping at the supposed address or there appears > some random page. 
To me it appears as if the vm_object, vm_map etc > data structures are OK, but somehow the pmaps don't get right. I'm > currently using 7.1 RELEASE on amd64, but I'm planning to try the > same on -CURRENT as soon as I get it properly ported. I even tried > calling pmap_enter_object explicitly before returning to the user > space, but it doesn't seem to help. > > Another thing is that there may be some bugs related > OBJ_ONEMAPPING. We need to explicitly clear it at places, and > sometimes artificially bump up the vm_object reference count to > avoid code related to ONEMAPPING from trashing the object's > mappings. Is this a known issue? > > Any advice? > > --Pekka Nikander > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org > " > From rwatson at FreeBSD.org Sat Mar 14 09:15:03 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sat Mar 14 09:15:09 2009 Subject: sys/vimage.h and net/opt_inet6 not exist on 7.1 release? In-Reply-To: References: Message-ID: On Fri, 13 Mar 2009, Jacky Oh wrote: > Im trying to compile sys/netinet/ip_fw_pfil.c as example of KLD firewall > module. This file include net/opt_inet6.h and sys/vimage.h but the compiler > dont find it. This files seems that was deleted from the source tree. Hi Jacky-- vimage.h, at least, should exist only in 8.x -- could you check that you're checking out the right version of the file for use on 7.x? Robert N M Watson Computer Laboratory University of Cambridge From ady at freebsd.ady.ro Sun Mar 15 12:33:46 2009 From: ady at freebsd.ady.ro (Adrian Penisoara) Date: Sun Mar 15 12:33:53 2009 Subject: ETA for ZFS v. 13 Merge From HEAD ? 
Message-ID: <78cb3d3f0903151209r46837d70m914a23e30a19060e@mail.gmail.com> Hi Pawel, Coming back to the subject, when do you think we might have a merge of r185029 (import of ZFS version 13) from head back into -stable ? Is there anything we can help with to speed up the process (e.g. testing) ? PS: ZFS-FUSE on Linux has also reached v 13... Thank you, Adrian Penisoara ROFUG / EnterpriseBSD --------------------------- Date: Wed, 26 Nov 2008 10:52:41 +0100 From: Pawel Jakub Dawidek Subject: Re: svn commit: r185029 - in head: cddl/compat/opensolaris/include cddl/compat/opensolaris/misc cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/cmd/zinject cd... To: Attila Nagy Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Message-ID: <20081126095241.GA3188@garage.freebsd.pl> Content-Type: text/plain; charset="us-ascii" On Wed, Nov 26, 2008 at 10:15:58AM +0100, Attila Nagy wrote: > Hello, > > Pawel Jakub Dawidek wrote: > >Author: pjd > >Date: Mon Nov 17 20:49:29 2008 > >New Revision: 185029 > >URL: http://svn.freebsd.org/changeset/base/185029 > > > >Log: > > Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes. > > > This, and other changes stabilized ZFS by a great level in HEAD. > Do you plan to MFC these to 7-STABLE? Yes, but ETA yet. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! From mbsd at pacbell.net Sun Mar 15 14:55:58 2009 From: mbsd at pacbell.net (=?ISO-8859-1?Q?Mikko_Ty=F6l=E4j=E4rvi?=) Date: Sun Mar 15 14:56:05 2009 Subject: Bug in tcp wrappers? In-Reply-To: <49BA9E63.3040000@smartt.com> References: <49BA9E63.3040000@smartt.com> Message-ID: <20090315144440.N24160@antec.home> Hi Chris, On Fri, 13 Mar 2009, Chris St Denis wrote: > I think I've found a bug in libwrap/tcpwrappers. I think so too :) See below. > Before filing an actual bug report I want to get some feedback here > first. 
> > A hosts.allow file with ~1000 ips on a single line (Haven't experimented with > other quantities yet), causes network daemons that use libwrap stop accepting > incoming network connections and use 100% cpu on an incoming connection. > This problem appeared because sshguard placed a large number of IPs in my > hosts.allow file triggering this bug. > > I've left the affected daemons for a long period of time (once about 8 hours) > and they don't seem to come back, so I think this is more than just it taking > a while to loop through a 1000 item array of IPs > > > The production system that was affected is FreeBSD 7.0-32bit > Test system is FreeBSD 7.1-32bit > > Example hosts.allow file (IPs are randomly generated for purposes of example) > > sshd : 112.110.123.63 113.11.2.126 113.11.8.6 113.19.19.22 > 113.197.48.68 116.48.108.244 116.48.11.19 : deny > ALL : ALL : allow > > top output of affected system. sshd wcpu slowly crawls up to 100% over about > 30 seconds or so. > > crash# top > last pid: 692; load averages: 0.08, 0.04, 0.04 > up > 0+00:12:13 15:42:30 > 24 processes: 2 running, 22 sleeping > CPU: 49.7% user, 0.0% nice, 0.2% system, 0.2% interrupt, 49.9% idle > Mem: 9304K Active, 6004K Inact, 21M Wired, 32K Cache, 10M Buf, 947M Free > Swap: 1995M Total, 1995M Free > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU > COMMAND > 691 root 1 103 0 5760K 3660K CPU1 1 0:04 33.98% sshd > 672 root 1 4 0 8436K 3888K sbwait 1 0:00 0.00% sshd > 677 cstdenis 1 20 0 4460K 2288K pause 0 0:00 0.00% csh > 682 root 1 20 0 5484K 2632K pause 0 0:00 0.00% csh > 675 cstdenis 1 44 0 8436K 3896K select 0 0:00 0.00% sshd > > > A backtrace shows > > crash# gdb /usr/sbin/sshd 691 > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and > you are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. 
> There is absolutely no warranty for GDB. Type "show warranty" for > details. > This GDB was configured as "i386-marcel-freebsd"... > Attaching to program: /usr/sbin/sshd, process 691 > Reading symbols from /usr/lib/libssh.so.4...done. > Loaded symbols for /usr/lib/libssh.so.4 > Reading symbols from /lib/libutil.so.7...done. > Loaded symbols for /lib/libutil.so.7 > Reading symbols from /lib/libz.so.4...done. > Loaded symbols for /lib/libz.so.4 > Reading symbols from /usr/lib/libwrap.so.5...done. > Loaded symbols for /usr/lib/libwrap.so.5 > > Reading symbols from /libexec/ld-elf.so.1...done. > Loaded symbols for /libexec/ld-elf.so.1 > 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at > /usr/src/lib/libc/stdio/fgets.c:56 > 56 { > (gdb) bt > #0 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at > /usr/src/lib/libc/stdio/fgets.c:56 > #1 0x281124ee in xgets (ptr=0xbfbfe67b "", len=1, fp=0x283b8040) at > /usr/src/lib/libwrap/../../contrib/tcp_wrappers/misc.c:38 > #2 0x28111410 in table_match (table=0x28112c5c "/etc/hosts.allow", > request=0xbfbfeb14) > at > /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:162 > #3 0x28111540 in hosts_access (request=0xbfbfeb14) at > /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:132 > #4 0x08052b39 in main (ac=2, av=0xbfbfeecc) at > /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:1843 > (gdb) bt > #0 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at > /usr/src/lib/libc/stdio/fgets.c:56 > #1 0x281124ee in xgets (ptr=0xbfbfe67b "", len=1, fp=0x283b8040) at > /usr/src/lib/libwrap/../../contrib/tcp_wrappers/misc.c:38 > #2 0x28111410 in table_match (table=0x28112c5c "/etc/hosts.allow", > request=0xbfbfeb14) > at > /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:162 > #3 0x28111540 in hosts_access (request=0xbfbfeb14) at > /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:132 > #4 0x08052b39 in main (ac=2, av=0xbfbfeecc) at > 
/usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:1843
> (gdb) q
> The program is running. Quit anyway (and detach it)? (y or n) y
> Detaching from program: /usr/sbin/sshd, process 691
>
>
> A few questions
> 1. Is this a known issue of any sort? I've done some searching on it, but
> haven't found anything of interest.
> 2. Should this be reported to FreeBSD bug tracker, or to libwrap (or both)?
> Basically, is FreeBSD's libwrap (more or less) in sync with the main one, or
> is it completely separate?

When given an input line of more than 2k bytes, libwrap ends up in an
infinite loop in xgets(), calling fgets() with a read length of one.
As fgets() reads at most the length minus one characters, it will keep
"reading" and returning zero length strings.

Thus your server processes will remain stuck until aborted.

This Q&D patch makes libwrap behave as documented in hosts_access(5):

--- misc.c.orig 2009-03-15 14:06:11.000000000 -0700
+++ misc.c      2009-03-15 14:06:49.000000000 -0700
@@ -48,6 +48,8 @@
         ptr += got;
         len -= got;
         ptr[0] = 0;
+        if (len <= 1)
+            return start;
     }
     return (ptr > start ? start : 0);
 }

The documented behavior is:

"An error is reported when ... when the length of an access control
rule exceeds the capacity of an internal buffer; ..."

This is only slightly better, as the code will now try to parse the
remainder of the line as a rule, and either fail or, due to some
syntactic quirk, get a false match. From a security standpoint, both
are bad.

I don't think you'll get a false "allow" match in your case, but unless
you have a default "deny" rule somewhere at the end, access may be
granted when it shouldn't.

Please do file a FreeBSD bug. Is there even an upstream maintainer of
tcp wrappers? A quick search seems to indicate that it is more or
less abandoned, albeit adopted by several projects.

The immediate workarounds I can think of for you are:

- Somehow teach sshguard to write rules on multiple lines, each
  shorter than 2k.
  Splitting lines using backslashes will not help,
  as xgets() is concatenating continued lines into a single buffer
  (the one that is too small) anyway.

- Apply the patch above, change the definition of BUFLEN in tcpdchk.c
  and hosts_access.c to a "sufficiently large" value and rebuild
  libwrap. Of course, there is no "sufficiently large" value; with
  the current libwrap code, you'll always run the risk of lines being
  too long.

The real fix involves rewriting chunks of the libwrap code, or finding
a version where someone has already done so.

$.02,
/Mikko

From neldredge at math.ucsd.edu Sun Mar 15 15:11:13 2009
From: neldredge at math.ucsd.edu (Nate Eldredge)
Date: Sun Mar 15 15:11:20 2009
Subject: Bug in tcp wrappers?
In-Reply-To: <20090315144440.N24160@antec.home>
References: <49BA9E63.3040000@smartt.com> <20090315144440.N24160@antec.home>
Message-ID:

On Sun, 15 Mar 2009, Mikko Työläjärvi wrote:

> The real fix involves rewriting chunks of the libwrap code, or finding
> a version where someone has already done so.

It doesn't seem like it should be too bad. xgets is only called in three
places. It would be easy enough to replace it with something like glibc's
getline(3), that uses realloc to size a buffer appropriately.

If nobody else feels like doing this, maybe I will.

--
Nate Eldredge
neldredge@math.ucsd.edu

From mbsd at pacbell.net Sun Mar 15 15:22:40 2009
From: mbsd at pacbell.net (Mikko Työläjärvi)
Date: Sun Mar 15 15:22:48 2009
Subject: Bug in tcp wrappers?
In-Reply-To:
References: <49BA9E63.3040000@smartt.com> <20090315144440.N24160@antec.home>
Message-ID: <20090315151836.K24160@antec.home>

On Sun, 15 Mar 2009, Nate Eldredge wrote:

> On Sun, 15 Mar 2009, Mikko Työläjärvi wrote:
>
>> The real fix involves rewriting chunks of the libwrap code, or finding
>> a version where someone has already done so.
>
> It doesn't seem like it should be too bad. xgets is only called in three
> places.
It would be easy enough to replace it with something like glibc's
> getline(3), that uses realloc to size a buffer appropriately.

Yes, it should be pretty straightforward. I just noticed that OpenBSD
applied a (better) variant of my patch for the infinite loop problem in
2003. They didn't address the "line too long" problem, though.

> If nobody else feels like doing this, maybe I will.

And if you don't, I just might :)

/Mikko

From ken at mthelicon.com Sun Mar 15 15:40:02 2009
From: ken at mthelicon.com (Pegasus Mc Cleaft)
Date: Sun Mar 15 15:40:14 2009
Subject: ETA for ZFS v. 13 Merge From HEAD ?
In-Reply-To: <78cb3d3f0903151209r46837d70m914a23e30a19060e@mail.gmail.com>
References: <78cb3d3f0903151209r46837d70m914a23e30a19060e@mail.gmail.com>
Message-ID: <4AE4493D5E9141E8812E4BC83FB5A2A5@PegaPegII>

Hi Adrian,

I am not sure, but I didn't think ZFS 13 was ever going to be merged into
7-stable. I thought the kernel memory requirements were too great (just going
back in my memory on that one). Also, I think there are still a few bugs
left with the zil being enabled (and/or prefetch) causing lockups on machines
with a lot of IO. I know I have hit that bug a few times on my machine when
using various torrent clients when they want to preallocate large amounts of
disk space.

I personally can't wait until a later version of ZFS is imported that
supports encryption. I can finally say good-bye to our GEOM ELI USB drives
for backups!! Nevertheless, I am quite thankful to those involved in
porting V13 to FreeBSD. It's a wonderful improvement and my FS of choice
when installing on new machines (especially zfs boot).

Best regards,
Peg

----- Original Message -----
From: "Adrian Penisoara"
To: "Pawel Jakub Dawidek"
Cc: ;
Sent: Sunday, March 15, 2009 7:09 PM
Subject: ETA for ZFS v. 13 Merge From HEAD ?

> Hi Pawel,
> Coming back to the subject, when do you think we might have a merge of
> r185029 (import of ZFS version 13) from head back into -stable ?
> > Is there anything we can help with to speed up the process (e.g. testing) > ? > > PS: ZFS-FUSE on Linux has also reached v 13... > > Thank you, > Adrian Penisoara > ROFUG / EnterpriseBSD > > --------------------------- > Date: Wed, 26 Nov 2008 10:52:41 +0100 > From: Pawel Jakub Dawidek > Subject: Re: svn commit: r185029 - in head: > cddl/compat/opensolaris/include cddl/compat/opensolaris/misc > cddl/contrib/opensolaris/cmd/zdb > cddl/contrib/opensolaris/cmd/zfs > cddl/contrib/opensolaris/cmd/zinject cd... > To: Attila Nagy > Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, > src-committers@freebsd.org > Message-ID: <20081126095241.GA3188@garage.freebsd.pl> > Content-Type: text/plain; charset="us-ascii" > > On Wed, Nov 26, 2008 at 10:15:58AM +0100, Attila Nagy wrote: >> Hello, >> >> Pawel Jakub Dawidek wrote: >> >Author: pjd >> >Date: Mon Nov 17 20:49:29 2008 >> >New Revision: 185029 >> >URL: http://svn.freebsd.org/changeset/base/185029 >> > >> >Log: >> > Update ZFS from version 6 to 13 and bring some FreeBSD-specific > changes. >> > >> This, and other changes stabilized ZFS by a great level in HEAD. >> Do you plan to MFC these to 7-STABLE? > > Yes, but ETA yet. > > -- > Pawel Jakub Dawidek http://www.wheel.pl > pjd@FreeBSD.org http://www.FreeBSD.org > FreeBSD committer Am I Evil? Yes, I Am! > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From ciphwn at gmail.com Mon Mar 16 07:46:14 2009 From: ciphwn at gmail.com (Cipta H) Date: Mon Mar 16 07:46:21 2009 Subject: writing libnetstat for Summer of Code 2009 Message-ID: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> Hello everyone, I'm a college student studying CS in Columbia University. 
I'm interested in doing this project for Summer of Code 2009: > Libprocstat and libnetstat > Suggested Summer of Code 2009 project idea > > Technical contact: Robert Watson > > Create, similar to libmemstat, wrapper libraries to support monitoring and management applications to avoid direct use of kvm. Three parts to the project: for each of the above, add kernel support to export data in a less ABI-sensitive way using sysctl, write a library to present the information in an extensible way to applications, and update applications to use the library instead of reaching directly into kernel memory / consuming sysctls. The goal is to allow the kernel implementation to change without breaking applications and requiring them to be recompiled, and to allow monitoring functions to be extended without breaking applications. This should also facilitate writing new classes of monitoring and profiling tools. I'm going to focus mainly on netstat, however. Aside from that, I have a few questions: 1. Aside from the bug report, has there been any other discussion on this issue? I can't seem to find any in the mailing lists. 2. How much experience in C do you need to do this project? Do you need to know the FreeBSD kernel? Thanks in advance, Cipta Herwana From rpaulo at freebsd.org Mon Mar 16 11:06:19 2009 From: rpaulo at freebsd.org (Rui Paulo) Date: Mon Mar 16 11:06:51 2009 Subject: writing libnetstat for Summer of Code 2009 In-Reply-To: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> References: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> Message-ID: <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> On 16 Mar 2009, at 14:16, Cipta H wrote: > 2. How much experience in C do you need to do this project? Do you > need to know the FreeBSD kernel? Yes, you need to understand the C programming language well and to be able to learn how the FreeBSD kernel works. You also need to figure out a way to structure the data. 
I know that XML was proposed in the past, but I don't know if this is the case.

--
Rui Paulo

From zbeeble at gmail.com Mon Mar 16 11:09:08 2009
From: zbeeble at gmail.com (Zaphod Beeblebrox)
Date: Mon Mar 16 11:09:21 2009
Subject: ETA for ZFS v. 13 Merge From HEAD ?
In-Reply-To: <4AE4493D5E9141E8812E4BC83FB5A2A5@PegaPegII>
References: <78cb3d3f0903151209r46837d70m914a23e30a19060e@mail.gmail.com> <4AE4493D5E9141E8812E4BC83FB5A2A5@PegaPegII>
Message-ID: <5f67a8c40903161109le12b8afuc25b8c1ec1b6f70c@mail.gmail.com>

On Sun, Mar 15, 2009 at 6:39 PM, Pegasus Mc Cleaft wrote:

> Hi Adrian,
>
> I am not sure, but I didn't think ZFS 13 was ever going to be merged into
> 7-stable. I thought the kernel memory requirements were too great (just going
> back in my memory on that one). Also, I think there are still a few bugs
> left with the zil being enabled (and/or prefetch) causing lockups on machines
> with a lot of IO. I know I have hit that bug a few times on my machine when
> using various torrent clients when they want to preallocate large amounts of
> disk space.
>
> I personally can't wait until a later version of ZFS is imported that
> supports encryption. I can finally say good-bye to our GEOM ELI USB drives
> for backups!! Nevertheless, I am quite thankful to those involved in
> porting V13 to FreeBSD. It's a wonderful improvement and my FS of choice
> when installing on new machines (especially zfs boot).

I think that you're touching on two entirely separate points here... What
it takes to upgrade ZFS in -STABLE and what it takes to bring ZFS modules
into FreeBSD.

I sincerely hope that ZFSv13 is planned for -STABLE. Last we left this
issue, testing and a few kernel improvements were in the way.
None of the kernel improvements were going to change the API, so the project was doable in -STABLE. That said, time marches on, 8.0-RELEASE draws ever nearer. When we were still several years out on 8.0 and ZFS was causing me more problems, I was much more keen to push for the port. I would still welcome it with open arms, but I'm not convinced that anyone is going to push it forward. The issue of encryption (along with many other issues) is tied to the ability of FreeBSD to compile and use ZFS modules. Just like netgraph modules extend the function of netgraph.ko and geom modules extend the base geom function, ZFS is designed (in Solaris, at least) to take modules. ZFS encryption is a module. I'm not clear on compression --- it would make sense that it is a module, but it seemingly got copied into FreeBSD as a core feature (and it may also be so in solaris). Anyways... is there any plans to allow for ZFS modules in FreeBSD? From bsd.quest at googlemail.com Mon Mar 16 11:39:05 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Mon Mar 16 11:39:11 2009 Subject: rebuilding libpcap Message-ID: <671bb5fc0903161139y2b039a14h1ab33cf1fe369e4@mail.gmail.com> Hello, how to correctly rebuild only libpcap from /usr/src/contrib without rebuilding the whole world ? I try to do in libpcap some changes, then make; make install in /usr/src/contrib/libpcap, but the changes are not visible by calling changed functions :( What I do wrong ? Thanks, Alexej P.S: % uname -v FreeBSD 7.0-RELEASE-p10 #1: Mon Mar 16 16:58:38 CET 2009 From ciphwn at gmail.com Mon Mar 16 11:41:17 2009 From: ciphwn at gmail.com (Cipta H) Date: Mon Mar 16 11:41:24 2009 Subject: writing libnetstat for Summer of Code 2009 In-Reply-To: <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> References: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> Message-ID: <13b997e60903161141j8faaf7frd6ce9b1423b40164@mail.gmail.com> XML? 
I was thinking of some opaque C structures that the functions write
data to, and then supply some accessor methods, just like the ones in
libmemstat. Or are you thinking of a different XML?

Cipta

On Mon, Mar 16, 2009 at 1:34 PM, Rui Paulo wrote:
> On 16 Mar 2009, at 14:16, Cipta H wrote:
>>
>> 2. How much experience in C do you need to do this project? Do you
>> need to know the FreeBSD kernel?
>
> Yes, you need to understand the C programming language well and to be able
> to learn how the FreeBSD kernel works. You also need to figure out a way to
> structure the data. I know that XML was proposed in the past, but I don't
> know if this is the case.
>
> --
> Rui Paulo
>
>

From dnelson at allantgroup.com Mon Mar 16 11:57:54 2009
From: dnelson at allantgroup.com (Dan Nelson)
Date: Mon Mar 16 11:58:01 2009
Subject: rebuilding libpcap
In-Reply-To: <671bb5fc0903161139y2b039a14h1ab33cf1fe369e4@mail.gmail.com>
References: <671bb5fc0903161139y2b039a14h1ab33cf1fe369e4@mail.gmail.com>
Message-ID: <20090316185750.GI24875@dan.emsphone.com>

In the last episode (Mar 16), Alexej Sokolov said:
> how to correctly rebuild only libpcap from /usr/src/contrib without
> rebuilding the whole world ? I try to do in libpcap some changes, then
> make; make install in
> /usr/src/contrib/libpcap,
> but the changes are not visible by calling changed functions :(
> What I do wrong ?

/usr/src/contrib is a repository of 3rd-party source trees, and they're not
meant to be built from. Try running your "make ; make install" in
/usr/src/lib/libpcap instead.
--
Dan Nelson
dnelson@allantgroup.com

From bsd.quest at googlemail.com Mon Mar 16 11:59:55 2009
From: bsd.quest at googlemail.com (Alexej Sokolov)
Date: Mon Mar 16 12:00:03 2009
Subject: rebuilding libpcap
In-Reply-To: <20090316185750.GI24875@dan.emsphone.com>
References: <671bb5fc0903161139y2b039a14h1ab33cf1fe369e4@mail.gmail.com> <20090316185750.GI24875@dan.emsphone.com>
Message-ID: <671bb5fc0903161159t2b20ce3fy77d9f282cc1df78d@mail.gmail.com>

Ohhh... thanks a lot! I'm just about to do it...

2009/3/16 Dan Nelson

> In the last episode (Mar 16), Alexej Sokolov said:
> > how to correctly rebuild only libpcap from /usr/src/contrib without
> > rebuilding the whole world ? I try to do in libpcap some changes, then
> > make; make install in
> > /usr/src/contrib/libpcap,
> > but the changes are not visible by calling changed functions :(
> > What I do wrong ?
>
> /usr/src/contrib is a repository of 3rd-party source trees, and they're not
> meant to be built from. Try running your "make ; make install" in
> /usr/src/lib/libpcap instead.
>
> --
> Dan Nelson
> dnelson@allantgroup.com

From delphij at delphij.net Mon Mar 16 12:04:51 2009
From: delphij at delphij.net (Xin LI)
Date: Mon Mar 16 12:04:59 2009
Subject: writing libnetstat for Summer of Code 2009
In-Reply-To: <13b997e60903161141j8faaf7frd6ce9b1423b40164@mail.gmail.com>
References: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> <13b997e60903161141j8faaf7frd6ce9b1423b40164@mail.gmail.com>
Message-ID: <49BEA2BC.6000405@delphij.net>

Hi, Cipta,

Cipta H wrote:
> XML?
I was thinking of some opaque C structures that the functions write
> data to, and then supply some accessor methods, just like the ones in
> libmemstat. Or are you thinking of a different XML?

I'm not very sure but I think Rui is referring to XML like the GEOM
subsystem has used (perhaps to have the kernel expose the statistics
data with XML and the userland part of the library parse and return the
result)?

> On Mon, Mar 16, 2009 at 1:34 PM, Rui Paulo wrote:
>> On 16 Mar 2009, at 14:16, Cipta H wrote:
>>> 2. How much experience in C do you need to do this project? Do you
>>> need to know the FreeBSD kernel?
>> Yes, you need to understand the C programming language well and to be able
>> to learn how the FreeBSD kernel works. You also need to figure out a way to
>> structure the data. I know that XML was proposed in the past, but I don't
>> know if this is the case.
>>
>> --
>> Rui Paulo

--
Xin LI http://www.delphij.net/
FreeBSD - The Power to Serve!

From chris at smartt.com Mon Mar 16 12:11:20 2009
From: chris at smartt.com (Chris St Denis)
Date: Mon Mar 16 12:11:27 2009
Subject: Bug in tcp wrappers?
In-Reply-To: <20090315144440.N24160@antec.home>
References: <49BA9E63.3040000@smartt.com> <20090315144440.N24160@antec.home>
Message-ID: <49BEA45A.5060603@smartt.com>

Mikko Työläjärvi wrote:
>
> Hi Chris,
>
> On Fri, 13 Mar 2009, Chris St Denis wrote:
>
>> I think I've found a bug in libwrap/tcpwrappers.
>
> I think so too :) See below.
>
>> Before filing an actual bug report I want to get some feedback here
>> first.
>> >> A hosts.allow file with ~1000 ips on a single line (Haven't >> experimented with >> other quantities yet), causes network daemons that use libwrap stop >> accepting >> incoming network connections and use 100% cpu on an incoming connection. >> This problem appeared because sshguard placed a large number of IPs >> in my >> hosts.allow file triggering this bug. >> >> I've left the affected daemons for a long period of time (once about >> 8 hours) >> and they don't seem to come back, so I think this is more than just >> it taking >> a while to loop through a 1000 item array of IPs >> >> >> The production system that was affected is FreeBSD 7.0-32bit >> Test system is FreeBSD 7.1-32bit >> >> Example hosts.allow file (IPs are randomly generated for purposes of >> example) >> >> sshd : 112.110.123.63 113.11.2.126 113.11.8.6 113.19.19.22 >> 113.197.48.68 116.48.108.244 116.48.11.19 : deny >> ALL : ALL : allow >> >> top output of affected system. sshd wcpu slowly crawls up to 100% >> over about >> 30 seconds or so. >> >> crash# top >> last pid: 692; load averages: 0.08, 0.04, 0.04 >> up >> 0+00:12:13 15:42:30 >> 24 processes: 2 running, 22 sleeping >> CPU: 49.7% user, 0.0% nice, 0.2% system, 0.2% interrupt, 49.9% idle >> Mem: 9304K Active, 6004K Inact, 21M Wired, 32K Cache, 10M Buf, 947M >> Free >> Swap: 1995M Total, 1995M Free >> >> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU >> COMMAND >> 691 root 1 103 0 5760K 3660K CPU1 1 0:04 33.98% sshd >> 672 root 1 4 0 8436K 3888K sbwait 1 0:00 0.00% sshd >> 677 cstdenis 1 20 0 4460K 2288K pause 0 0:00 0.00% csh >> 682 root 1 20 0 5484K 2632K pause 0 0:00 0.00% csh >> 675 cstdenis 1 44 0 8436K 3896K select 0 0:00 0.00% sshd >> >> >> A backtrace shows >> >> crash# gdb /usr/sbin/sshd 691 >> GNU gdb 6.1.1 [FreeBSD] >> Copyright 2004 Free Software Foundation, Inc. 
>> GDB is free software, covered by the GNU General Public License, and >> you are >> welcome to change it and/or distribute copies of it under certain >> conditions. >> Type "show copying" to see the conditions. >> There is absolutely no warranty for GDB. Type "show warranty" for >> details. >> This GDB was configured as "i386-marcel-freebsd"... >> Attaching to program: /usr/sbin/sshd, process 691 >> Reading symbols from /usr/lib/libssh.so.4...done. >> Loaded symbols for /usr/lib/libssh.so.4 >> Reading symbols from /lib/libutil.so.7...done. >> Loaded symbols for /lib/libutil.so.7 >> Reading symbols from /lib/libz.so.4...done. >> Loaded symbols for /lib/libz.so.4 >> Reading symbols from /usr/lib/libwrap.so.5...done. >> Loaded symbols for /usr/lib/libwrap.so.5 >> >> Reading symbols from /libexec/ld-elf.so.1...done. >> Loaded symbols for /libexec/ld-elf.so.1 >> 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at >> /usr/src/lib/libc/stdio/fgets.c:56 >> 56 { >> (gdb) bt >> #0 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at >> /usr/src/lib/libc/stdio/fgets.c:56 >> #1 0x281124ee in xgets (ptr=0xbfbfe67b "", len=1, fp=0x283b8040) at >> /usr/src/lib/libwrap/../../contrib/tcp_wrappers/misc.c:38 >> #2 0x28111410 in table_match (table=0x28112c5c "/etc/hosts.allow", >> request=0xbfbfeb14) >> at >> /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:162 >> #3 0x28111540 in hosts_access (request=0xbfbfeb14) at >> /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:132 >> #4 0x08052b39 in main (ac=2, av=0xbfbfeecc) at >> /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:1843 >> (gdb) bt >> #0 0x28373225 in fgets (buf=0xbfbfe67b "", n=1, fp=0x283b8040) at >> /usr/src/lib/libc/stdio/fgets.c:56 >> #1 0x281124ee in xgets (ptr=0xbfbfe67b "", len=1, fp=0x283b8040) at >> /usr/src/lib/libwrap/../../contrib/tcp_wrappers/misc.c:38 >> #2 0x28111410 in table_match (table=0x28112c5c "/etc/hosts.allow", >> request=0xbfbfeb14) >> at >> 
/usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:162 >> #3 0x28111540 in hosts_access (request=0xbfbfeb14) at >> /usr/src/lib/libwrap/../../contrib/tcp_wrappers/hosts_access.c:132 >> #4 0x08052b39 in main (ac=2, av=0xbfbfeecc) at >> /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:1843 >> (gdb) q >> The program is running. Quit anyway (and detach it)? (y or n) y >> Detaching from program: /usr/sbin/sshd, process 691 >> >> >> A few questions >> 1. Is this a known issue of any sort? I've done some searching on it, >> but >> haven't found anything of interest. >> 2. Should this be reported to FreeBSD bug tracker, or to libwrap (or >> both)? >> Basically, is FreeBSD's libwrap (more or less) in sync with the main >> one, or >> is it completely separate? > > When given an input line of more than 2k bytes, libwrap ends up in an > infinite loop in xgets(), calling fgets() with a read length of one. > As fgets() reads the length minus one characters, it will keep > "reading" and returning zero length strings. > > Thus your server processes will remain stuck until aborted. > > This Q&D patch makes libwrap behave as documented in hosts_access(5): > > --- misc.c.orig 2009-03-15 14:06:11.000000000 -0700 > +++ misc.c 2009-03-15 14:06:49.000000000 -0700 > @@ -48,6 +48,8 @@ > ptr += got; > len -= got; > ptr[0] = 0; > + if (len <= 1) > + return start; > } > return (ptr > start ? start : 0); > } > > > The documented behavior is: > > "An error is reported when ... when the length of an access control > rule exceeds the capacity of an internal buffer; ..." > > This is only sligtly better, as the code will now try to parse the > remainder of the line as a rule, and either fail or, due to some > syntactic quirk, get a false match. From a security standpoint, both > are bad. > > I don't think you'll get a false "allow" match in your case, but unless > you have a default "deny" rule somewhere at the end, access may be > granted when it shouldn't. 
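
To see why the loop described above never makes progress, here is a
minimal sketch of a read loop that keeps handing fgets() whatever buffer
space is left. It is not the libwrap source: the function names are made
up for illustration, and it assumes (as the backtrace shows for FreeBSD's
libc) that fgets() with a size of 1 returns an empty string rather than
failing. A stall limit is added so that the demonstration itself stops.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a line reader that appends to a fixed buffer.
 * Returns the number of zero-length reads seen before a full line arrived
 * or before max_stalls was reached.
 */
static int count_stalled_reads(FILE *fp, char *buf, int len, int max_stalls)
{
    char *ptr = buf;
    int stalls = 0;

    assert(len > 0);
    while (stalls < max_stalls && fgets(ptr, len, fp) != NULL) {
        size_t got = strlen(ptr);

        if (got > 0 && ptr[got - 1] == '\n')
            return stalls;      /* complete line: the normal way out */
        if (got == 0)
            stalls++;           /* only room for the NUL, nothing was read */
        ptr += got;
        len -= (int)got;        /* drops to 1 once the buffer is full */
    }
    return stalls;
}

/* Creates a temporary file holding one line of line_len characters. */
static FILE *make_long_line(int line_len)
{
    FILE *fp = tmpfile();
    int i;

    if (fp == NULL)
        return NULL;
    for (i = 0; i < line_len; i++)
        fputc('x', fp);
    fputc('\n', fp);
    rewind(fp);
    return fp;
}
```

With a 16-byte buffer and a 100-character line, the first call fills the
buffer and every later call has len == 1, so only the stall limit ends
the loop. A line that fits in the buffer returns without any stalls.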
> > Please do file a FreeBSD bug. Is there even an upstream maintainer of > tcp wrappers? A quick search seems to indicate that it is more or > less abandoned, albeit adopted by several projects. > > The immediate workarounds I can think of for you are: > > - Somehow teach sshguard to write rules on multiple lines, each > shorter than 2k. Splitting lines using backslashes will not help, > as xgets() is concatenating continued lines into a single buffer > (the one that is too small) anyway. > > - Apply the patch above, change the definition of BUFLEN in tcpdchk.c > and hosts_access.c to a "sufficiently large" value and rebuild > libwrap. Of course, there is no "sufficiently large" value; with > the current libwrap code, you'll always run the risk of lines being > too long. > > The real fix involves rewriting chunks of the libwrap code, or finding > a version where someone has already done so. > > $.02, > /Mikko Thanks. I have created PR 132705. http://www.freebsd.org/cgi/query-pr.cgi?pr=132705 My immediate workaround was even simpler than that. I just turned off sshGuard. It's just there to provide an additional level of security which isn't really needed. I may put it back using one of the firewall modules instead of hosts.allow in the future. I guess it was just never designed for the kind of distributed brute force ssh and ftp attacks that have been occurring more in the last several months. From ciphwn at gmail.com Mon Mar 16 12:14:01 2009 From: ciphwn at gmail.com (Cipta H) Date: Mon Mar 16 12:14:09 2009 Subject: writing libnetstat for Summer of Code 2009 In-Reply-To: <49BEA2BC.6000405@delphij.net> References: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> <13b997e60903161141j8faaf7frd6ce9b1423b40164@mail.gmail.com> <49BEA2BC.6000405@delphij.net> Message-ID: <13b997e60903161213t320252dbg56e96335e79f7eb9@mail.gmail.com> Thanks for the reply, Xin. 
I'm aware of something called sysctl, and if
I am accepted to work on this project, my main task is to ensure all live
network data will come from sysctl, but the only XML I know of is the
markup language. Perhaps someone more knowledgeable can point me
to the right resource? Thanks in advance.

Cipta

On Mon, Mar 16, 2009 at 3:04 PM, Xin LI wrote:
> I'm not very sure but I think Rui is referring to XML like the GEOM
> subsystem has used (perhaps to have the kernel expose the statistics
> data with XML and the userland part of the library parse and return the
> result)?

From rpaulo at freebsd.org Mon Mar 16 12:26:36 2009
From: rpaulo at freebsd.org (Rui Paulo)
Date: Mon Mar 16 12:26:43 2009
Subject: writing libnetstat for Summer of Code 2009
In-Reply-To: <49BEA2BC.6000405@delphij.net>
References: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> <13b997e60903161141j8faaf7frd6ce9b1423b40164@mail.gmail.com> <49BEA2BC.6000405@delphij.net>
Message-ID: <08007F9A-E6FB-4DEE-AB4A-84D3991561D5@freebsd.org>

On 16 Mar 2009, at 19:04, Xin LI wrote:

> Hi, Cipta,
>
> Cipta H wrote:
>> XML? I was thinking of some opaque C structures that the functions write
>> data to, and then supply some accessor methods, just like the ones in
>> libmemstat. Or are you thinking of a different XML?
>
> I'm not very sure but I think Rui is referring to XML like the GEOM
> subsystem has used (perhaps to have the kernel expose the statistics
> data with XML and the userland part of the library parse and return the
> result)?

That's it. Of course, Robert should know more about this than I do, and
since he mentioned libmemstat, opaque C structs are probably what he was
thinking of.

--
Rui Paulo

From delphij at delphij.net Mon Mar 16 14:32:09 2009
From: delphij at delphij.net (Xin LI)
Date: Mon Mar 16 14:32:15 2009
Subject: writing libnetstat for Summer of Code 2009
In-Reply-To: <13b997e60903161213t320252dbg56e96335e79f7eb9@mail.gmail.com>
References: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> <13b997e60903161141j8faaf7frd6ce9b1423b40164@mail.gmail.com> <49BEA2BC.6000405@delphij.net> <13b997e60903161213t320252dbg56e96335e79f7eb9@mail.gmail.com>
Message-ID: <49BEC548.90309@delphij.net>

Hi, Cipta,

Cipta H wrote:
> Thanks for the reply, Xin. I'm aware of something called sysctl, and if
> I am accepted to work on this project, my main task is to ensure all live
> network data will come from sysctl, but the only XML I know of is the
> markup language. Perhaps someone more knowledgeable can point me
> to the right resource? Thanks in advance.

Yes, it's the markup language. I think whether or not to use XML really
depends on whether you want structured data. The current approach we
have used is to use kvm(3) and obtain the data directly based on
knowledge of in-kernel data structures. By using XML, the structured
data can be represented in a self-explaining form and known data can be
easily extracted from it (of course you will need to design a schema for
the data but that's fairly easy once you know what you are willing to
expose).

Note that you may want to contact Robert to better understand the
problem that libnetstat and friends are meant to solve.
XML is one possible approach (and we have a built-in XML parser
library that can be used by userland programs) but it's not the only
possible approach :)

> Cipta
>
> On Mon, Mar 16, 2009 at 3:04 PM, Xin LI wrote:
>> I'm not very sure but I think Rui is referring to XML like the GEOM
>> subsystem has used (perhaps to have the kernel expose the statistics
>> data with XML and the userland part of the library parse and return the
>> result)?

Cheers,
--
Xin LI http://www.delphij.net/
FreeBSD - The Power to Serve!

From ciphwn at gmail.com Mon Mar 16 15:11:36 2009
From: ciphwn at gmail.com (Cipta H)
Date: Mon Mar 16 15:11:43 2009
Subject: writing libnetstat for Summer of Code 2009
In-Reply-To: <49BEC656.50702@freebsd.org>
References: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> <13b997e60903161141j8faaf7frd6ce9b1423b40164@mail.gmail.com> <49BEA2BC.6000405@delphij.net> <13b997e60903161213t320252dbg56e96335e79f7eb9@mail.gmail.com> <49BEC656.50702@freebsd.org>
Message-ID: <13b997e60903161511p70ee2252qf6d594dae13e4ec@mail.gmail.com>

On Mon, Mar 16, 2009 at 5:36 PM, Tim Kientzle wrote:
> Many people consider top-posting to be rude. FYI.
> My comments below, where they belong.
>
> Cipta H wrote:
>>
>> Thanks for the reply, Xin. I'm aware of something called sysctl, and if
>> I am accepted to work on this project, my main task is to ensure all live
>> network data will come from sysctl, but the only XML I know of is the
>> markup language. Perhaps someone more knowledgeable can point me
>> to the right resource? Thanks in advance.
>> >> Cipta >> >> On Mon, Mar 16, 2009 at 3:04 PM, Xin LI wrote: >> >>> I'm not very sure but I think Rui is referring XML like the GEOM >>> subsystem has used (perhaps to have the kernel expose the statistics >>> data with XML and the userland part of the library parse and return the >>> result)? > > There are two different issues: > * Kernel <-> userland communications > * library <-> client program communications > > There is ample precedent for the former to use sysctl > interfaces that return XML from the kernel that is then > parsed in userland. In particular, this makes it much > easier to extend in the future, as long as the proposed > libnetstat library ignores data it doesn't understand. > (In the past, many tools parsed in-kernel data > structures to obtain this kind of information, which is > prone to breakage whenever the kernel changes. Making > this so that kernel and tools can evolve more independently > is a major goal here.) > > For the latter, some kind of opaque C structure > makes sense, since that simplifies the client programs. > > So really this breaks down into two very different > tasks: > * Designing and implementing a sysctl that returns > network statistics as an XML blob > * Designing and implementing a C library that knows > how to fetch the XML blob, parse it, and return > data to client programs. > > Does this make more sense now? > > Tim > Yes, it does, Tim, thank you so much. I'll be sure to look into sysctl.h and study its ability to return XML. I will also contact Robert about this project once I finish gathering more info. Thank you all for answering my questions. Cipta P.S. Sorry about top-posting. I'll be sure to remember it from now on.
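The design point quoted above, that the library should ignore data it doesn't understand so that kernel and tools can evolve independently, can be sketched in a few lines of C. This is only an illustration under stated assumptions: the element names (ipackets, opackets, frobnicated), the blob layout, and the functions xml_get_ulong() and ns_parse() are all hypothetical and do not come from any real libnetstat or FreeBSD sysctl. The naive strstr() lookup stands in for a real XML parser, and in a real library the blob would be fetched from the kernel rather than hard-coded.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sample of what a statistics sysctl might return.
 * The <frobnicated> element stands for a field added by a newer
 * kernel that this version of the library knows nothing about.
 */
const char ns_sample_blob[] =
    "<netstat><if name=\"em0\">"
    "<ipackets>1200</ipackets>"
    "<frobnicated>7</frobnicated>"
    "<opackets>950</opackets>"
    "</if></netstat>";

/* What the client sees; a real library would keep this opaque. */
struct ns_stats {
	unsigned long ipackets;
	unsigned long opackets;
};

/*
 * Look up the first <tag>NUMBER</tag> in blob.  Returns 0 and stores
 * the value in *out on success, -1 if the tag is absent or holds no
 * digits.  Anything else in the blob is simply never looked at.
 */
int
xml_get_ulong(const char *blob, const char *tag, unsigned long *out)
{
	char open[64];
	const char *p;
	char *end;
	int n;

	n = snprintf(open, sizeof(open), "<%s>", tag);
	if (n < 0 || (size_t)n >= sizeof(open))
		return (-1);
	p = strstr(blob, open);
	if (p == NULL)
		return (-1);
	p += n;
	*out = strtoul(p, &end, 10);
	if (end == p)
		return (-1);
	return (0);
}

/* Fill in the fields this library version understands. */
int
ns_parse(const char *blob, struct ns_stats *st)
{
	if (xml_get_ulong(blob, "ipackets", &st->ipackets) != 0)
		return (-1);
	if (xml_get_ulong(blob, "opackets", &st->opackets) != 0)
		return (-1);
	return (0);
}
```

A client would call ns_parse(ns_sample_blob, &st) and read st.ipackets and st.opackets. The unknown <frobnicated> element does not cause a failure, which is the property that lets the kernel add fields without breaking older tools.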
From kientzle at freebsd.org Mon Mar 16 15:13:49 2009 From: kientzle at freebsd.org (Tim Kientzle) Date: Mon Mar 16 15:13:56 2009 Subject: writing libnetstat for Summer of Code 2009 In-Reply-To: <13b997e60903161213t320252dbg56e96335e79f7eb9@mail.gmail.com> References: <13b997e60903160716x21881adfma588c32551c36e6f@mail.gmail.com> <21C1FF9D-4CDA-4476-9F11-3DE281279C1A@freebsd.org> <13b997e60903161141j8faaf7frd6ce9b1423b40164@mail.gmail.com> <49BEA2BC.6000405@delphij.net> <13b997e60903161213t320252dbg56e96335e79f7eb9@mail.gmail.com> Message-ID: <49BEC656.50702@freebsd.org> Many people consider top-posting to be rude. FYI. My comments below, where they belong. Cipta H wrote: > Thanks for the reply, Xin. I'm aware of something called sysctl, and if > I am accepted to work on this project, my main task is to ensure all live > network data will come from sysctl, but the only XML I know of is the > markup language. Perhaps someone more knowledgeable can point me > to the right resource? Thanks in advance. > > Cipta > > On Mon, Mar 16, 2009 at 3:04 PM, Xin LI wrote: > >>I'm not very sure but I think Rui is referring XML like the GEOM >>subsystem has used (perhaps to have the kernel expose the statistics >>data with XML and the userland part of the library parse and return the >>result)? There are two different issues: * Kernel <-> userland communications * library <-> client program communications There is ample precedent for the former to use sysctl interfaces that return XML from the kernel that is then parsed in userland. In particular, this makes it much easier to extend in the future, as long as the proposed libnetstat library ignores data it doesn't understand. (In the past, many tools parsed in-kernel data structures to obtain this kind of information, which is prone to breakage whenever the kernel changes. Making this so that kernel and tools can evolve more independently is a major goal here.) 
For the latter, some kind of opaque C structure makes sense, since that simplifies the client programs. So really this breaks down into two very different tasks: * Designing and implementing a sysctl that returns network statistics as an XML blob * Designing and implementing a C library that knows how to fetch the XML blob, parse it, and return data to client programs. Does this make more sense now? Tim From srikanthhcu05 at gmail.com Tue Mar 17 02:29:26 2009 From: srikanthhcu05 at gmail.com (srikanth jampala) Date: Tue Mar 17 02:29:33 2009 Subject: SA add notification to externa module Message-ID: Hi all This is my first posting. I want to receive notifications about SA (security association) add/delete events from the kernel in my external kernel module. How can I do this? Thanks in advance for your suggestions. Srikanth. From rwatson at FreeBSD.org Tue Mar 17 05:20:48 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Tue Mar 17 05:20:54 2009 Subject: SA add notification to externa module In-Reply-To: References: Message-ID: On Tue, 17 Mar 2009, srikanth jampala wrote: > This is my first posting. > > I want to receive notifications about SA (security association) add/delete > events from the kernel in my external kernel module. > > How can I do this? > > Thanks in advance for your suggestions. I'm not sure if PF_KEY has an async notification event, but in principle you could consume those inside the kernel, not just in a user application. Alternatively, you might reasonably submit a patch to add an EVENTHANDLER(9) event at the right points in the kernel code so that future versions of FreeBSD will allow your code to plug in more easily. We already provide event handler hooks for things like process fork/exit, arrival/departure of network interfaces, etc.
The trick is to place them at the right points so that appropriate locks are held, and you'll want to avoid having your handler code change the semantics of the calling site (i.e., don't sleep if that's not allowed). Robert N M Watson Computer Laboratory University of Cambridge From gemochka at gmail.com Tue Mar 17 08:40:37 2009 From: gemochka at gmail.com (Gema niskazhu) Date: Tue Mar 17 08:40:43 2009 Subject: Trying to use ptrace under FBSD Message-ID: <84133fac0903170818j422891b2ibd0951fcced3368e@mail.gmail.com> Hi all! First of all sorry for my bad english. I am using Free BSD CURRENT x86_64. I am trying to use ptrace under free bsd simply to test that it works Here is my code: #include #include #include #include #include #include #include #include main() { int pid; int wait_val; long long counter = 1; switch(pid = fork() ) { case 0: ptrace(PT_TRACE_ME, 0, 0); execl("/bin/ls","ls",0); break; default: wait(&wait_val); while(WIFSTOPED(wait_val)) { if (ptrace(PT_STEP, pid, *(caddr_t)1)) break; wait(&wait_val); counter++; } } printf("==%lld\n", counter); } But on compilation i get smth like /usr/include/sys/ptrace.h:90: error: expected specifier-qualifier-list before 'lwpid_t' /usr/include/sys/ptrace.h:158: error: expected declaration specifiers or '...' before 'caddr_t' I've googled a lot but cant understand whats wrong... Any suggestions? Thanks in advance From pluknet at gmail.com Tue Mar 17 08:46:35 2009 From: pluknet at gmail.com (pluknet) Date: Tue Mar 17 08:46:42 2009 Subject: Trying to use ptrace under FBSD In-Reply-To: <84133fac0903170818j422891b2ibd0951fcced3368e@mail.gmail.com> References: <84133fac0903170818j422891b2ibd0951fcced3368e@mail.gmail.com> Message-ID: 2009/3/17 Gema niskazhu : > Hi all! > > First of all sorry for my bad english. > > I am using Free BSD CURRENT x86_64. 
> > I am trying to use ptrace under free bsd > > simply to test that it works > > Here is my code: > > #include > #include > #include > #include > #include There is at least an incorrect include order. sys/types.h is a prerequisite for sys/ptrace.h > #include > #include > #include > > > main() > { > int pid; > int wait_val; > long long counter = 1; > > switch(pid = fork() ) > { > case 0: > > > ptrace(PT_TRACE_ME, 0, 0); > > > > execl("/bin/ls","ls",0); > break; > > default: > > > wait(&wait_val); > > while(WIFSTOPED(wait_val)) > { > if (ptrace(PT_STEP, pid, *(caddr_t)1)) break; > > wait(&wait_val); > > counter++; > } > > > } > > printf("==%lld\n", counter); > > > } > > But on compilation i get smth like > > /usr/include/sys/ptrace.h:90: error: expected specifier-qualifier-list > before 'lwpid_t' > /usr/include/sys/ptrace.h:158: error: expected declaration specifiers or > '...' before 'caddr_t' > > I've googled a lot but cant understand whats wrong... > > Any suggestions? > > Thanks in advance > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > -- wbr, pluknet From grarpamp at gmail.com Tue Mar 17 23:29:36 2009 From: grarpamp at gmail.com (grarpamp) Date: Wed Mar 18 04:13:33 2009 Subject: ZFS version list [was ETA for ZFS ver: n] Message-ID: ZFS version list [was ETA for ZFS ver: n] I needed raw, bit reliable, stable, encrypted storage. ZFS gave all but the last part so far. None of the features since v6 were useful to me. And as with most software, there are surely tons of fixes and optimizations being handled silently that are useful. Additions at or before v6 that were nifty: compression hot spares raidz2 ditto blocks sha256 - chained back to the uberblock thing Integrated crypto will be very useful, simply to eliminate that GEOM.
Even if GBDE and GELI are cool :) Hopefully ZFS will include a strong 256 bit cipher along with other options. My guess is that it will be out from SUN midyear, before FBSD 8.0, and thus a potential for 8.0. The ZFS iSCSI bit might be cool. Putting things like that all under the ZFS hierarchy could be sickly entertaining :) If BSD chflags(2) schg, as on UFS, does or will work on ZFS, that's cool. See the Solaris chmod command. FBSD could very well have magically encrypted user homedirs that make use of some of the inherent ZFS [delegation, etc?] features. login could be hacked as could sshd or possibly pamify things. Haven't really thought about it other than Apple has it. Don't know about other BSD's. It is awesome that FBSD has ZFS! No matter what gets done when, thanks for all the work on it... past, present and on into future. Version list attached for people to reference... -------------- next part -------------- ======================================== http://opensolaris.org/os/community/zfs/version// ======================================== ZFS Pool Version 14 This version includes support for the following feature: * passthrough-x aclinherit property support This feature is available in: * Solaris Express Community Edition, build 103 The related bug and PSARC case for the version 14 change are: * 6765166 Need to provide mechanism to optionally inherit ACE_EXECUTE * PSARC 2008/659 New ZFS "passthrough-x" ACL inheritance rules ======================================== ZFS Pool Version 13 This version includes support for the following features: * usedbysnapshots property * usedbychildren property * usedbyrefreservation property * usedbydataset property These features are available in: * Solaris Express Community Edition, build 98 The related bug and PSARC case for version 13 change is: * 6730799 want snapused property * PSARC 2008/518 ZFS space accounting enhancements ======================================== ZFS Pool Version 12 This version includes support 
for the following feature: * Properties for Snapshots This feature is available in: * Solaris Express Community Edition, build 96 The related bug for the version 12 change is: * 6701797 want user properties on snapshots ======================================== ZFS Pool Version 11 This version includes support for the following feature: * Improved zpool scrub / resilver performance This feature is available in: * Solaris Express Community Edition, build 94 The related bug for the version 11 change is: * 6343667 scrub/resilver has to start over when a snapshot is taken * (Note, this bug is fixed when using build 94 even with older pool versions. However, upgrading the pool can improve scrub performance when there are many filesystems, snapshots, and clones.) ======================================== ZFS Pool Version 10 This version includes support for the following feature: * Devices can be added to a storage pool as "cache devices." These devices provide an additional layer of caching between main memory and disk. Using cache devices provides the greatest performance improvement for random read-workloads of mostly static content. This feature is available in the Solaris Express Community Edition, build 78. The Solaris 10 10/08 release includes ZFS pool version 10, but support for cache devices is not included in this Solaris release. The related bug for the version 10 change is: * 6536054 second tier ("external") ARC ======================================== ZFS Pool Version 9 This version includes support for the following features: * In addition to the existing ZFS quota and reservation features, this release includes dataset quotas and reservations that do not include descendent datasets, such as snapshots and clones, in the space consumption. ("zfs set refquota" and "zfs set refreservation".) * A reservation is automatically set when a non-sparse ZFS volume is created that matches the size of the volume. 
This release provides an immediate reservation feature so that you set a reservation on a non-sparse volume with enough space to take snapshots and modify the contents of the volume. * CIFS server support These features are available in Solaris Express Community Edition, build 77. The related bugs for version 9 changes are: * 6431277 want filesystem-only quotas * 6483677 need immediate reservation * 6617183 CIFS Service PSARC 2006/715 ======================================== ZFS Pool Version 8 This version now supports the ability to delegate zfs(1M) administrative tasks to ordinary users. This feature is available in: * Solaris Express Community Edition, build 69 * Solaris 10 10/08 release The related bug for the version 8 change is: * 6349470 investigate non-root restore/backup ======================================== ZFS Pool Version 7 This version includes support for the following feature: The ZFS Intent Log (ZIL) satisfies the need of some applications to know the data they changed is on stable storage on return from a system call. The Intent Log holds records of those system calls and they are replayed if the system power fails or panics if they have not been committed to the main pool. When the Intent Log is allocated from the main pool, it allocates blocks that chain through the pool. This version adds the capability to specify a separate Intent Log device or devices. This feature is available in: * Solaris Express Community Edition, build 68 * Solaris 10 10/08 release The related bug for the version 7 change is: * 6339640 Make ZIL use NVRAM when available. 
======================================== ZFS Pool Version 6 This version includes support for the following feature: * 'bootfs' pool property This feature is available in: * Solaris Express Community Edition, build 62 * Solaris 10 10/08 release The related bugs for version 6 changes are as follows: * 4929890 ZFS Boot support for the x86 platform * 6479807 pools need properties ======================================== ZFS Pool Version 5 This version includes support for the following feature: * gzip compression for ZFS datasets This feature is available in: * Solaris Express Community Edition, build 62 * Solaris 10 10/08 release The related bug for the version 5 changes is: * 6536606 gzip compression for ZFS ======================================== ZFS Pool Version 4 This version includes support for the following feature: * zpool history This feature is available in: * Solaris Express Community Edition, build 62 * Solaris 10 8/07 release The related bugs for version 4 changes are as follows: * 6529406 zpool history needs to bump the on-disk version * 6343741 want to store a command history on disk ======================================== ZFS Pool Version 3 This version includes support for the following features: * Hot spares * Double-parity RAID-Z (raidz2) * Improved RAID-Z accounting These features are available in: * Solaris Express Community Edition, build 42 * Solaris 10 11/06 release, (build 3) The related bugs for version 3 changes are as follows: * 6405966 Hot Spare support in ZFS * 6417978 double parity RAID-Z a.k.a. RAID6 * 6288488 du reports misleading size on RAID-Z ======================================== ZFS Pool Version 2 This version includes support for "Ditto Blocks", or replicated metadata. Due to the tree-like structure of the ZFS on-disk format, an uncorrectable error in a leaf block may be relatively benign, while an uncorrectable error in pool metadata can result in an unopenable pool. 
This feature introduces automatic replication of metadata (up to 3 copies of each block) independent of any underlying pool-wide redundancy. For example, on a pool with a single mirror, the most critical metadata will appear three times on each side of the mirror, for a total of six copies. This ensures that while user data may be lost due to corruption, all data in the pool will be discoverable and the pool will still be usable. This will be expanded in the future to allow user data replication on a per-dataset basis. This feature was integrated on 4/10/06 with the following bug fix: 6410698 ZFS metadata needs to be more highly replicated (ditto blocks) This feature is available in: * Solaris Express Community Edition, build 38 * Solaris 10 10/06 release (build 09) ======================================== ZFS Pool Version 1 This is the initial ZFS on-disk format as integrated on 10/31/05. During the next six months of internal use, there were a few on-disk format changes that did not result in a version number change, but resulted in a flag day since earlier versions could not read the newer changes. The first official releases supporting this version are: * Solaris Express Community Edition, build 36 * Solaris 10 6/06 release Earlier releases may not support this version, despite being formatted with the same on-disk number. 
This is due to: 6389368 fat zap should use 16k blocks (with backwards compatability) 6390677 version number checking makes upgrades challenging ======================================== From avg at icyb.net.ua Wed Mar 18 08:44:47 2009 From: avg at icyb.net.ua (Andriy Gapon) Date: Wed Mar 18 08:44:56 2009 Subject: usb keyboard dying at loader prompt In-Reply-To: <493D37DB.6030902@icyb.net.ua> References: <4912E462.4090608@icyb.net.ua> <491586B9.2020303@vwsoft.com> <4919851B.7050800@icyb.net.ua> <492FF127.807@icyb.net.ua> <20081128134802.GA75900@onelab2.iet.unipi.it> <493D37DB.6030902@icyb.net.ua> Message-ID: <49C116EB.7020409@icyb.net.ua> I would like to report that I am no longer seeing the issue in the subject line. The problem was fixed by the recent commits of jhb ( I tested stable/7). -- Andriy Gapon From aryeh.friedman at gmail.com Wed Mar 18 08:49:23 2009 From: aryeh.friedman at gmail.com (Aryeh M. Friedman) Date: Wed Mar 18 08:49:31 2009 Subject: is gmirror byte or fs level? Message-ID: <49C117FF.5070102@gmail.com> If I have a dual boot system w/ Vista on the first slices and all the FreeBSD filesystems on the second and then run gmirror on the disk will the mirror disk also have the Vista slice? From pieter at degoeje.nl Wed Mar 18 09:05:42 2009 From: pieter at degoeje.nl (Pieter de Goeje) Date: Wed Mar 18 09:05:49 2009 Subject: is gmirror byte or fs level? In-Reply-To: <49C117FF.5070102@gmail.com> References: <49C117FF.5070102@gmail.com> Message-ID: <200903181653.34748.pieter@degoeje.nl> On Wednesday 18 March 2009 16:49:19 Aryeh M. Friedman wrote: > If I have a dual boot system w/ Vista on the first slices and all the > FreeBSD filesystems on the second and then run gmirror on the disk will > the mirror disk also have the Vista slice? Yes, gmirror is block level and has no knowledge whatsoever of the filesystems on top of it. 
-- Pieter de Goeje From kostikbel at gmail.com Wed Mar 18 10:24:00 2009 From: kostikbel at gmail.com (Kostik Belousov) Date: Wed Mar 18 10:24:08 2009 Subject: threaded, forked, rethreaded processes will deadlock In-Reply-To: References: <4966F81C.3070406@elischer.org> <20090109163426.GC2825@green.homeunix.org> <49678BBC.8050306@elischer.org> <20090116211959.GA12007@green.homeunix.org> <49710BD6.7040705@FreeBSD.org> <20090120004135.GB12007@green.homeunix.org> <20090121230033.GC12007@green.homeunix.org> <20090122045637.GA61058@zim.MIT.EDU> Message-ID: <20090318163222.GE7716@deviant.kiev.zoral.com.ua> On Thu, Jan 22, 2009 at 12:42:56AM -0500, Daniel Eischen wrote: > On Wed, 21 Jan 2009, David Schultz wrote: > > >I think there *is* a real bug here, but there's two distinct ways > >to fix it. When a threaded process forks, malloc acquires all its > >locks so that its state is consistent after a fork. However, the > >post-fork hook that's supposed to release these locks fails to do > >so in the child because the child process isn't threaded, and > >malloc_mutex_unlock() is optimized to be a no-op in > >single-threaded processes. If the child *stays* single-threaded, > >malloc() works by accident even with all the locks held because > >malloc_mutex_lock() is also a no-op in single-threaded processes. > >But if the child goes multi-threaded, then things break. > > > >Solution 1 is to actually unlock the locks in the child process, > >which is what Brian is proposing. > > > >Solution 2 is to take the position that all of this pre- and > >post-fork bloat in the fork() path is gratuitous and should be > >removed. The rationale here is that if you fork with multiple > >running threads, there's scads of ways in which the child's heap > >could be inconsistent; fork hooks would be needed not just in > >malloc(), but in stdio, third party libraries, etc. Why should > >malloc() be special? 
It's the programmer's job to quiesce all the > >threads before calling fork(), and if the programmer doesn't do > >this, then POSIX only guarantees that async-signal-safe functions > >will work. > > > >Note that Solution 2 also fixes Brian's problem if he quiesces all > >of his worker threads before forking (as he should!) With the > >pre-fork hook removed, all the locks will start out free in the > >child. So that's what I vote for... > > The problem is that our own libraries (libthr included) > need to malloc() for themselves, even after a fork() in > the child. After a fork(), the malloc locks should be > reinitialized in the child if it was threaded, so that > our implementation actually works for all the async > signal calls, fork(), exec(), etc. I forget the exact > failure modes for very common cases, but if you remove > the re-initialization of the malloc locks, I'm sure > you will have problems. > > Perhaps much of this malloc() stuff goes away when we > move to pthread locks that are not pointers to allocated > objects, but instead are actual objects/structures. > This needs to be done in order for mutexes/CVs/etc > to be PTHREAD_PROCESS_SHARED (placed in shared memory > and used by multiple processes). In other words, > pthread_mutex_t goes from this: > > typedef struct pthread_mutex *pthread_mutex_t; > > to something like this: > > struct __pthread_mutex { > uint32_t lock; > ... > } > typedef struct __pthread_mutex pthread_mutex_t; > > Same thing for CVs, and we probably should convert any other > locks used internally by libc/libpthread (spinlocks). > > So after a fork(), there is no need to reallocate anything, > it can just be reinitialized if necessary. > I looked at the issue once more recently, and I propose the following much less intrusive patch. It is somewhat hackish, but I think that it would be good to have this working. Most other Unixes do have working thread library after the fork. Any objections ? 
diff --git a/lib/libthr/thread/thr_fork.c b/lib/libthr/thread/thr_fork.c index bc410d1..ae6b9ad 100644 --- a/lib/libthr/thread/thr_fork.c +++ b/lib/libthr/thread/thr_fork.c @@ -173,14 +173,19 @@ _fork(void) /* Ready to continue, unblock signals. */ _thr_signal_unblock(curthread); - if (unlock_malloc) + if (unlock_malloc) { + __isthreaded = 1; _malloc_postfork(); + __isthreaded = 0; + } /* Run down atfork child handlers. */ TAILQ_FOREACH(af, &_thr_atfork_list, qe) { if (af->child != NULL) af->child(); } + + THR_UMUTEX_UNLOCK(curthread, &_thr_atfork_lock); } else { /* Parent process */ errsave = errno; -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090318/716dc44f/attachment.pgp From deischen at freebsd.org Wed Mar 18 11:12:27 2009 From: deischen at freebsd.org (Daniel Eischen) Date: Wed Mar 18 11:12:34 2009 Subject: threaded, forked, rethreaded processes will deadlock In-Reply-To: <20090318163222.GE7716@deviant.kiev.zoral.com.ua> References: <4966F81C.3070406@elischer.org> <20090109163426.GC2825@green.homeunix.org> <49678BBC.8050306@elischer.org> <20090116211959.GA12007@green.homeunix.org> <49710BD6.7040705@FreeBSD.org> <20090120004135.GB12007@green.homeunix.org> <20090121230033.GC12007@green.homeunix.org> <20090122045637.GA61058@zim.MIT.EDU> <20090318163222.GE7716@deviant.kiev.zoral.com.ua> Message-ID: On Wed, 18 Mar 2009, Kostik Belousov wrote: > > I looked at the issue once more recently, and I propose the following > much less intrusive patch. It is somewhat hackish, but I think that > it would be good to have this working. Most other Unixes do have > working thread library after the fork. Any objections ? No objections. -- DE From aryeh.friedman at gmail.com Wed Mar 18 11:41:22 2009 From: aryeh.friedman at gmail.com (Aryeh M. 
Friedman) Date: Wed Mar 18 11:41:29 2009 Subject: confusion Message-ID: <49C14048.5070409@gmail.com> I just set up mirroring on my dual boot fb-7.1/vistaX32SP1 machine (the dual boot works fine) but I got this message when I attempted to mount vista via sysutils/fusefs-ntfs: Actual VCN (0x3369700000100) of index buffer is different from expected VCN (0x1). Failed to mount '/dev/mirror/gm0s1': Input/output error NTFS is either inconsistent, or there is a hardware fault, or it's a SoftRAID/FakeRAID hardware. In the first case run chkdsk /f on Windows then reboot into Windows twice. The usage of the /f parameter is very important! If the device is a SoftRAID/FakeRAID then first activate it and mount a different device under the /dev/mapper/ directory, (e.g. /dev/mapper/nvidia_eahaabcc1). Please see the 'dmraid' documentation for more details. I don't have a clue what it means where do I look for more? From kientzle at freebsd.org Wed Mar 18 13:10:59 2009 From: kientzle at freebsd.org (Tim Kientzle) Date: Wed Mar 18 13:11:06 2009 Subject: is gmirror byte or fs level? In-Reply-To: <200903181653.34748.pieter@degoeje.nl> References: <49C117FF.5070102@gmail.com> <200903181653.34748.pieter@degoeje.nl> Message-ID: <49C15547.6030608@freebsd.org> Pieter de Goeje wrote: > On Wednesday 18 March 2009 16:49:19 Aryeh M. Friedman wrote: > >>If I have a dual boot system w/ Vista on the first slices and all the >>FreeBSD filesystems on the second and then run gmirror on the disk will >>the mirror disk also have the Vista slice? > > Yes, gmirror is block level and has no knowledge whatsoever of the filesystems on top of it. But of course, gmirror works by intercepting writes to the disk. Vista does not use gmirror, so writes from Vista will not be mirrored. Only writes from FreeBSD to the Vista slice will get mirrored, which is almost certainly not what you want. 
Tim From davidxu at freebsd.org Wed Mar 18 20:58:07 2009 From: davidxu at freebsd.org (David Xu) Date: Wed Mar 18 20:58:14 2009 Subject: threaded, forked, rethreaded processes will deadlock In-Reply-To: <20090318163222.GE7716@deviant.kiev.zoral.com.ua> References: <4966F81C.3070406@elischer.org> <20090109163426.GC2825@green.homeunix.org> <49678BBC.8050306@elischer.org> <20090116211959.GA12007@green.homeunix.org> <49710BD6.7040705@FreeBSD.org> <20090120004135.GB12007@green.homeunix.org> <20090121230033.GC12007@green.homeunix.org> <20090122045637.GA61058@zim.MIT.EDU> <20090318163222.GE7716@deviant.kiev.zoral.com.ua> Message-ID: <49C1C356.9090006@freebsd.org> Kostik Belousov wrote: > I looked at the issue once more recently, and I propose the following > much less intrusive patch. It is somewhat hackish, but I think that > it would be good to have this working. Most other Unixes do have > working thread library after the fork. Any objections ? > > diff --git a/lib/libthr/thread/thr_fork.c b/lib/libthr/thread/thr_fork.c > index bc410d1..ae6b9ad 100644 > --- a/lib/libthr/thread/thr_fork.c > +++ b/lib/libthr/thread/thr_fork.c > @@ -173,14 +173,19 @@ _fork(void) > /* Ready to continue, unblock signals. */ > _thr_signal_unblock(curthread); > > - if (unlock_malloc) > + if (unlock_malloc) { > + __isthreaded = 1; > _malloc_postfork(); > + __isthreaded = 0; > + } > > /* Run down atfork child handlers. */ > TAILQ_FOREACH(af, &_thr_atfork_list, qe) { > if (af->child != NULL) > af->child(); > } > + > + THR_UMUTEX_UNLOCK(curthread, &_thr_atfork_lock); ^^^ This line is not needed. > } else { > /* Parent process */ > errsave = errno; From patfbsd at davenulle.org Thu Mar 19 14:16:51 2009 From: patfbsd at davenulle.org (Patrick =?ISO-8859-15?Q?Lamaizi=E8re?=) Date: Thu Mar 19 14:16:58 2009 Subject: cryptosoft(4) not locked ? Message-ID: <20090319221650.4a8274ff@baby-jane.lamaiziere.net> Hello, I'm looking the cryptosoft driver and I notice it is not locked at all. 
As far as I can see it can be used from several contexts. I think it should be locked? Regards. From ota at j.email.ne.jp Fri Mar 20 01:11:23 2009 From: ota at j.email.ne.jp (Yoshihiro Ota) Date: Fri Mar 20 01:11:30 2009 Subject: 2 uni-directional TCP connection good? Message-ID: <20090320045319.04484fc5.ota@j.email.ne.jp> Hi folks. I have a question on network programming. It will be nice if someone could answer. I saw a program that opens 2 TCP connections. One connection is only used for server to client messaging only and the other connection is used only for client to server messaging. First of all, because TCP is already bi-directional communication, I don't think it is necessary to make 2 connections in the first place. After talking to my friend, he said it was very bad to do such things for three reasons. 1. With TCP connections, only sender side can detect some communication issues passively if happened. By using two connections, you lost that ability by your self. I agree on this one. 2. He also said that it would also waste network bandwidth. 3. He also said that it would cause some data flushing/synchronization issues. Indeed, this was what I saw with the program. However, I couldn't understand why it could happen. What I saw was that, from time to time, the sender side reported it sent messages with some sequence numbers but the receiver didn't actually receive these messages for a long time; I think it was about a couple of seconds to several seconds between two hosts on the same switch. Could anyone explain if #2 is true and why #3 happens? Regards, Hiro From mdc at prgmr.com Fri Mar 20 02:25:43 2009 From: mdc at prgmr.com (Michael David Crawford) Date: Fri Mar 20 02:25:49 2009 Subject: 2 uni-directional TCP connection good? In-Reply-To: <20090320045319.04484fc5.ota@j.email.ne.jp> References: <20090320045319.04484fc5.ota@j.email.ne.jp> Message-ID: <49C35A58.2030607@prgmr.com> Yoshihiro Ota wrote: > I saw a program that opens 2 TCP connections.
> One connection is only used for server to client messaging only > and the other connection is used only for client to server messaging. > 2. He also said that it would also waste network bandwidth. You have a two-way communication no matter what you do. But if you don't actually use inbound direction, all it gets used for is the receipt of ACK packets. That is, the inbound connection is used to make the data transfer reliable. If you don't have any payload data on the inbound connection, then the outbound connection won't have any ACK packets. If you're sending payload data, the ACK info can "hitchhike" along with the payload packets, thus saving bandwidth. But if you're not sending any payload data at all, there will be packets transmitted which contain the ACKs and nothing else. The extra network overhead will be modest if you're sending a lot of data all at once, say transferring a large file. But if very little data is sent per packet, say individual characters in a telnet connection, the overhead would be very high. If you have a single connection with payload data in both directions, then the ACKs will almost always ride along with some payload data. The only time a packet will contain nothing but an ACK will be when some data was transmitted, but none is to be received at the time. Mike -- Michael David Crawford mdc@prgmr.com prgmr.com - We Don't Assume You Are Stupid. Xen-Powered Virtual Private Servers: http://prgmr.com/xen From rwatson at FreeBSD.org Fri Mar 20 06:24:10 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Fri Mar 20 06:24:17 2009 Subject: 2 uni-directional TCP connection good? In-Reply-To: <20090320045319.04484fc5.ota@j.email.ne.jp> References: <20090320045319.04484fc5.ota@j.email.ne.jp> Message-ID: On Fri, 20 Mar 2009, Yoshihiro Ota wrote: > 1. With TCP connections, only sender side can detect some communication > issues passively if happened. By using two connections, you lost that > ability by your self. I agree on this one. 
Could you expand a bit on this point? While the connection creation process is (usually) asymmetric, once the connection is built it's essentially the same state machine on both sides of the connection, and socket semantics with respect to the state machine are effectively identical. Applications on both sides should be able to detect disconnect, monitor connection state using TCP_INFO, etc. Robert N M Watson Computer Laboratory University of Cambridge From alessandro.dev at gmail.com Fri Mar 20 07:09:25 2009 From: alessandro.dev at gmail.com (Alessandro Silveira) Date: Fri Mar 20 07:09:31 2009 Subject: Suppress boot prompt (Dummy question) Message-ID: <720e1f20903200637p6eda022cs8bdfa0f363aaadcf@mail.gmail.com> In previous versions of FreeBSD I removed the boot prompt by setting the variable [autoboot_delay = "-1"], but in version 7.1 it does not work. How do I remove the boot prompt in FreeBSD 7.1? Thanks Alessandro From gabriele.modena at gmail.com Sat Mar 21 04:50:32 2009 From: gabriele.modena at gmail.com (Gabriele Modena) Date: Sat Mar 21 04:50:39 2009 Subject: GSoC: Semantic File System Message-ID: <1fe1d5d60903210422g70efef15hdd685695cdf8df3c@mail.gmail.com> Hello, I am an AI master student at the university of Amsterdam. One of my current research interests lies in the area of information retrieval and I would like to do a project within my University research group starting next June. I am actually studying background literature about semantic filesystem and information retrieval over local files. Being also quite interested in kernel development, I would like to propose a proof of concept that implements such techniques. My goal, though, would not be just a reimplementation of existing code, but possibly some more extensive work that combines techniques already used in other domains of II. Could this be an interesting Summer of Code proposal for the FreeBSD Foundation?
I plan to write down some notes/ideas (and details) I have on a wiki starting from next week. Regards. From julian at elischer.org Sat Mar 21 10:00:27 2009 From: julian at elischer.org (Julian Elischer) Date: Sat Mar 21 10:00:34 2009 Subject: GSoC: Semantic File System In-Reply-To: <1fe1d5d60903210422g70efef15hdd685695cdf8df3c@mail.gmail.com> References: <1fe1d5d60903210422g70efef15hdd685695cdf8df3c@mail.gmail.com> Message-ID: <49C519FF.6010006@elischer.org> Gabriele Modena wrote: > Hello, > I am an AI master student at the university of Amsterdam. > > One of my current research interests lies in the area of information > retrieval and I would like to do a project > within my University research group starting next June. > > I am actually studying background literature about semantic filesystem > and information retrieval over local files. > > Being also quite interested in kernel development, I would like to > propose a proof of concept that implements such techniques. > My goal, though, would not be just a reimplementation of existing > code, but possibly some more extensive work > that combines techniques already used in other domains of II. > > Could this be an interesting Summer of Code proposal for the FreeBSD Foundation? > > I plan to write down some notes/ideas (and details) I have on a wiki > starting from next week. It sounds like something that would at least be worth following further. For myself I wouldn't mind knowing a bit more about what kind of "semantic filesystem" techniques you would mean to implement but that is just my own curiosity, (and, I admit it, complete lack of knowledge in that area (pointers welcome :-) ) ) Julian > > > Regards.
> _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" From ciphwn at gmail.com Sat Mar 21 12:36:39 2009 From: ciphwn at gmail.com (Cipta H) Date: Sat Mar 21 12:36:46 2009 Subject: sysctl returning XML Message-ID: <13b997e60903211236g26e1449dve34712fab7709748@mail.gmail.com> Hello all, I'm interested in parsing sysctl output into a program. Now I've heard from here a while ago that some sysctl OIDs can return data in XML format, but so far I have only found one example, kern.geom.confxml. Are there any others that anyone happens to know about, especially in networking? Thanks. Cipta From ivoras at freebsd.org Sat Mar 21 15:16:48 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Sat Mar 21 15:16:55 2009 Subject: sysctl returning XML In-Reply-To: <13b997e60903211236g26e1449dve34712fab7709748@mail.gmail.com> References: <13b997e60903211236g26e1449dve34712fab7709748@mail.gmail.com> Message-ID: Cipta H wrote: > Hello all, > > I'm interested in parsing sysctl output into a program. Now I've heard > from here a while ago that some sysctl OIDs can return data in XML > format, but so far I have only found one example, kern.geom.confxml. > Are there any others that anyone happens to know about, especially in > networking? Thanks. No, not in networking. XML is returned mostly in new sysctls (i.e. for subsystems that are relatively new, recently written). In 8-CURRENT there's kern.sched.topology_spec . -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 258 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090321/08258a2d/signature.pgp From babkin at verizon.net Sun Mar 22 09:12:11 2009 From: babkin at verizon.net (Sergey Babkin) Date: Sun Mar 22 09:22:40 2009 Subject: GSoC: Semantic File System Message-ID: <81564424.534529.1237734712737.JavaMail.root@vms181.mailsrvcs.net> Sorry if this sounds like a stupid suggestion, but have you thought about [...] prototype first? It's usually much easier to deve[...] Then after the features get worked out, move it into the [...] -SB Mar 21, 2009 07:51:18 AM, gabriele.mod[...] > Hello, > I am an AI master student at the university of Amsterdam. > > On of my current research interests lays in the area of information > retrieval and I would like to do a project within my University research > group starting next june. > > I am actually studying background literature about semantic filesystem and > information retrieval over local files. > > Being also quite interested in kernel development, I would like to propose a > proof of concept that implements such techniques. My goal, though, would not > be just a reimplementation of existing code, but possibly some more > extensive work that combines techniques already used in other domains of II. > > Could this be an interesting Summer of Code proposal for the FreeBSD > Foundation? > > I plan to write down some notes/ideas (and details) I have on a wiki > starting from next week. > > Regards. From rwatson at FreeBSD.org Sun Mar 22 09:52:55 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Sun Mar 22 09:53:01 2009 Subject: GSoC: Semantic File System In-Reply-To: <1fe1d5d60903210422g70efef15hdd685695cdf8df3c@mail.gmail.com> References: <1fe1d5d60903210422g70efef15hdd685695cdf8df3c@mail.gmail.com> Message-ID: On Sat, 21 Mar 2009, Gabriele Modena wrote: > I am an AI master student at the university of Amsterdam.
> > On of my current research interests lays in the area of information > retrieval and I would like to do a project within my University research > group starting next june. > > I am actually studying background literature about semantic filesystem and > information retrieval over local files. > > Being also quite interested in kernel development, I would like to propose a > proof of concept that implements such techniques. My goal, though, would not > be just a reimplementation of existing code, but possibly some more > extensive work that combines techniques already used in other domains of II. > > Could this be an interesting Summer of Code proposal for the FreeBSD > Foundation? > > I plan to write down some notes/ideas (and details) I have on a wiki > starting from next week. Hi Gabriele-- We are certainly not uninterested in projects along these lines, but I think the trick will be creating a convincing proposal that argues that (a) you can do the work in a summer, (b) there's a compelling usage case for including the results in FreeBSD, and (c) find a mentor who can supervise you in this project. What sort of semantic file system do you have in mind? How would you feel about a middle-ground project along the lines of Mac OS X Spotlight or similar efficient userspace indexing of a file system based on feedback from the file system about what has changed, or something BeOS-like, in which indexing takes place for extended attributes rather than for contents? Robert N M Watson Computer Laboratory University of Cambridge From chuckr at telenix.org Sun Mar 22 14:38:35 2009 From: chuckr at telenix.org (Chuck Robey) Date: Sun Mar 22 14:38:55 2009 Subject: dbus causing my cvsup to fail Message-ID: <49C6AFD6.2070400@telenix.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 This one completely mystifies me. I have a little script I use to cvsup, cvs update, and rebuild my system, and it's been hitting failures about once a month over the last 6 months. 
The failures are all fairly alike: cvsup fails to apply a delta to one of the text files in /home/ncvs/usr/ports (not the subdirs, things like UPDATING only). Generally I don't notice this until hours later, when I notice that my cvs update of that file failed, and I've been fixing it by deleting the affected file in my /home/ncvs, and re-running my script. Just now, I happened to run it while I was watching, and I caught this coming from my messages, it's obviously from the same processes (my machine is named "april"): Mar 22 17:03:32 april dbus-daemon: Would reject message, 1 matched rules; type="method_call", sender=":1.59" (uid=1001 pid=42493 comm=") interface="org.freedesktop.DBus.Introspectable" member="Introspect" error name="(unset)" requested_reply=0 destination="org.freedesktop.Hal" (uid=0 pid=1202 comm=")) I lost the actual error message, but I recall that the 1.59 was part of the string defining the delta. The rest of it I'm blaming on dbus due to the very coincidental timing, exactly the same as the error. I wish I'd kept the original error, but it's gone now, it didn't go into messages. So, somehow, dbus is causing this, but I'm not on particularly good terms with dbus & hal, so I can't tell where to go to start fixing this. Anyone get anything from this error message? -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAknGr9YACgkQz62J6PPcoOkabQCdEE6Icqcs13cZpc8W8OkPgJ7B RaAAoIIepuLb6eXAurn/iGicQNNXqCVl =rHj3 -----END PGP SIGNATURE----- From spawk at acm.poly.edu Sun Mar 22 20:00:54 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Sun Mar 22 20:01:00 2009 Subject: Doing away with NGROUPS_MAX in src/sys/sys/syslimits.h? Message-ID: <49C6F4F4.5030609@acm.poly.edu> Ahoy. I got bitten by this today--a system I administer for someone had users in more than 16 groups, so I had to bump the value, recompile the kernel, and reboot. 
It seems desirable to (at the very least) make this a read-only tunable that can be set using /boot/loader.conf, so as to avoid source modification and kernel recompilation. I had a look around, and noticed that NGROUPS_MAX is used to construct static arrays in a couple of locations ("ibcs2_gid_t iset[NGROUPS_MAX];"). It seems that malloc(9)/MALLOC(9) can be used to allocate memory for the array instead, and panic() (or something) can be called if the allocation fails, no? Is that about the gist of it? If I'm not overlooking something major, I'd like to take a stab at it. -Boris From varshney.ruchi at gmail.com Sun Mar 22 18:35:11 2009 From: varshney.ruchi at gmail.com (Ruchi Varshney) Date: Sun Mar 22 20:20:32 2009 Subject: AVR-GCC compiler options Message-ID: Hi,I am looking for a way to intermix source code with the asm code generated when I compile a .c file "avr-gcc -S" option. Right now, I know that when I use "avr-objdump -S" on the .s file obtained from avr-gcc, I can see that the output is intermixes with the actual source code from the .c file. Is there a way I can get the source code to appear in the .s file when I use "avr-gcc"? Thanks Ruchi From ota at j.email.ne.jp Sun Mar 22 20:53:02 2009 From: ota at j.email.ne.jp (Yoshihiro Ota) Date: Sun Mar 22 20:53:08 2009 Subject: 2 uni-directional TCP connection good? In-Reply-To: References: <20090320045319.04484fc5.ota@j.email.ne.jp> Message-ID: <20090322235253.432874dd.ota@j.email.ne.jp> On Fri, 20 Mar 2009 13:24:09 +0000 (GMT) Robert Watson wrote: > > On Fri, 20 Mar 2009, Yoshihiro Ota wrote: > > > 1. With TCP connections, only sender side can detect some communication > > issues passively if happened. By using two connections, you lost that > > ability by your self. I agree on this one. > > Could you expand a bit on this point? 
While the connection creation process > (usually) asymmetric, once the connection is built it's essentially the same > state machine on both sides of the connection, and socket semantics with > respect to the state machine are effectively identical. Application on both > sides should be able to detect disconnect, monitor connection state using > TCP_INFO, etc. What I meant was that there were cases when a receiver could not tell weather no data was coming or communication was interrupted. Once connection is established, a route is available between a server and a client. Let's say this route is broken for some reasons, i.e. someone unplugged a cable or a firewall started dropping or rejecting between these server and client, a sender may not notice as soon as it happens but at least, a sender knows a massages was not delivered right. On the other hand, receiver side does not have any idea that a message delivery failure has happened at all or for a while unless using heartbeat messages in upper layer. KEEP_ALIVE option seems to be implementation dependent such that you cannot assure TCP connection availability for every minute. Thanks, Hiro From das at FreeBSD.ORG Sun Mar 22 21:57:27 2009 From: das at FreeBSD.ORG (David Schultz) Date: Sun Mar 22 21:57:33 2009 Subject: Doing away with NGROUPS_MAX in src/sys/sys/syslimits.h? In-Reply-To: <49C6F4F4.5030609@acm.poly.edu> References: <49C6F4F4.5030609@acm.poly.edu> Message-ID: <20090323043937.GA61818@zim.MIT.EDU> On Sun, Mar 22, 2009, Boris Kochergin wrote: > Ahoy. I got bitten by this today--a system I administer for someone had > users in more than 16 groups, so I had to bump the value, recompile the > kernel, and reboot. It seems desirable to (at the very least) make this > a read-only tunable that can be set using /boot/loader.conf, so as to > avoid source modification and kernel recompilation. 
I had a look around, > and noticed that NGROUPS_MAX is used to construct static arrays in a > couple of locations ("ibcs2_gid_t iset[NGROUPS_MAX];"). It seems that > malloc(9)/MALLOC(9) can be used to allocate memory for the array > instead, and panic() (or something) can be called if the allocation > fails, no? Is that about the gist of it? If I'm not overlooking > something major, I'd like to take a stab at it. There's already a kern.ngroups sysctl, but there are many places where `ngroups' needs to be used in preference to NGROUPS in the kernel. In userland, sysconf(_SC_NGROUPS_MAX) needs to be used in preference to NGROUPS_MAX. From doconnor at gsoft.com.au Sun Mar 22 23:06:27 2009 From: doconnor at gsoft.com.au (Daniel O'Connor) Date: Sun Mar 22 23:06:34 2009 Subject: AVR-GCC compiler options In-Reply-To: References: Message-ID: <200903231636.17259.doconnor@gsoft.com.au> On Monday 23 March 2009 11:38:01 Ruchi Varshney wrote: > Hi,I am looking for a way to intermix source code with the asm code > generated when I compile a .c file "avr-gcc -S" option. > Right now, I know that when I use "avr-objdump -S" on the .s file obtained > from avr-gcc, I can see that the output is intermixes with the actual > source code from the .c file. Is there a way I can get the source code to > appear in the .s file when I use "avr-gcc"? You'd be better off asking this question on the avr-gcc list. You can pass -Wa,-adhlmsn=foo.lst to gcc and it will make a .lst which will do what I think you want.. -- Daniel O'Connor software and network engineer for Genesis Software - http://www.gsoft.com.au "The nice thing about standards is that there are so many of them to choose from." -- Andrew Tanenbaum GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. 
Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090323/46a4f3fa/attachment.pgp From perryh at pluto.rain.com Mon Mar 23 00:56:09 2009 From: perryh at pluto.rain.com (perryh@pluto.rain.com) Date: Mon Mar 23 00:56:20 2009 Subject: 2 uni-directional TCP connection good? In-Reply-To: <20090322235253.432874dd.ota@j.email.ne.jp> References: <20090320045319.04484fc5.ota@j.email.ne.jp> <20090322235253.432874dd.ota@j.email.ne.jp> Message-ID: <49c7381d.eJH7/fiaDJB9Gr6c%perryh@pluto.rain.com> > What I meant was that there were cases when a receiver could not > tell weather no data was coming or communication was interrupted. > Once connection is established, a route is available between a > server and a client. Let's say this route is broken for some > reasons, i.e. someone unplugged a cable or a firewall started > dropping or rejecting between these server and client, a sender > may not notice as soon as it happens but at least, a sender knows > a massages was not delivered right. On the other hand, receiver > side does not have any idea that a message delivery failure has > happened at all or for a while unless using heartbeat messages > in upper layer. KEEP_ALIVE option seems to be implementation > dependent such that you cannot assure TCP connection availability > for every minute. The whole point of TCP (vs IP alone, or UDP) is to establish reliable end-to-end communication over unreliable underlying links. If a packet is corrupted or lost, it gets resent. If a route goes down, and an alternate is available, TCP will -- eventually -- find it and recover. If the last (or only) route goes down, TCP will in principle wait indefinitely for a route to become available, whether by reestablishment of the original or provision of an alternative. So you are correct that a receiver can't tell the difference between a loss of connectivity and the sender having crashed, however the situation is entirely symmetric: the sender can't tell the difference either. 
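The receiver-side ambiguity discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: `socket.socketpair()` stands in locally for a single TCP connection between two hosts, and the `PING`/`PONG` messages, the `DEADLINE` value and the `wait_for_peer` helper are hypothetical names that do not come from any program mentioned here.

```python
import socket

# socketpair() returns two connected, bidirectional stream sockets. It is a
# local stand-in for ONE TCP connection, not a real network path.
client, server = socket.socketpair()

HEARTBEAT = b"PING\n"  # hypothetical application-level heartbeat message
DEADLINE = 0.2         # hypothetical: seconds of silence tolerated


def wait_for_peer(sock, deadline):
    """Return the bytes received, or None if the deadline passes in silence."""
    sock.settimeout(deadline)
    try:
        return sock.recv(64)
    except socket.timeout:
        return None


# Both directions share the same connection.
client.sendall(HEARTBEAT)
first = wait_for_peer(server, DEADLINE)   # heartbeat arrives
server.sendall(b"PONG\n")
reply = wait_for_peer(client, DEADLINE)   # answer travels back on the same socket

# Now the sender goes quiet. The receiver observes nothing at all: an idle
# peer and a broken path look the same until its own deadline expires.
silent = wait_for_peer(server, DEADLINE)

client.close()
server.close()
print(first, reply, silent)
```

The deadline is an application-level policy choice; TCP itself would simply keep waiting.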
It all gets sorted out when communication is reestablished; at that point traffic will resume (if the link had been down) or the uncrashed end will get a connection reset (if its peer had crashed). The practice of sending keep-alive packets simply converts a temporary (thus potentially recoverable) communication loss into what amounts to an unrecoverable crash of whichever end gets impatient first. From rwatson at FreeBSD.org Mon Mar 23 01:20:21 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Mon Mar 23 01:20:28 2009 Subject: 2 uni-directional TCP connection good In-Reply-To: <20090322235253.432874dd.ota@j.email.ne.jp> References: <20090320045319.04484fc5.ota@j.email.ne.jp> <20090322235253.432874dd.ota@j.email.ne.jp> Message-ID: On Sun, 22 Mar 2009, Yoshihiro Ota wrote: >> On Fri, 20 Mar 2009, Yoshihiro Ota wrote: >> >>> 1. With TCP connections, only sender side can detect some communication >>> issues passively if happened. By using two connections, you lost that >>> ability by your self. I agree on this one. >> >> Could you expand a bit on this point? While the connection creation >> process (usually) asymmetric, once the connection is built it's essentially >> the same state machine on both sides of the connection, and socket >> semantics with respect to the state machine are effectively identical. >> Application on both sides should be able to detect disconnect, monitor >> connection state using TCP_INFO, etc. > > What I meant was that there were cases when a receiver could not tell > weather no data was coming or communication was interrupted. Once > connection is established, a route is available between a server and a > client. Let's say this route is broken for some reasons, i.e. someone > unplugged a cable or a firewall started dropping or rejecting between these > server and client, a sender may not notice as soon as it happens but at > least, a sender knows a massages was not delivered right. 
On the other > hand, receiver side does not have any idea that a message delivery failure > has happened at all or for a while unless using heartbeat messages in upper > layer. KEEP_ALIVE option seems to be implementation dependent such that you > cannot assure TCP connection availability for every minute. This is generally considered a robustness property rather than a fragility issue, but yes: if you need a liveliness property for idle connections with TCP, it's something you have to implement at the application layer, and many protocols indeed do this. I don't see that this is an argument for using two TCP connections as opposed to one, however. If you're interested in alternative protocols, however, SCTP allows a number of these protocol behaviors to be modified, and includes support for a heartbeat. Robert N M Watson Computer Laboratory University of Cambridge From ttw+bsd at cobbled.net Mon Mar 23 05:57:53 2009 From: ttw+bsd at cobbled.net (ttw+bsd@cobbled.net) Date: Mon Mar 23 05:58:01 2009 Subject: Doing away with NGROUPS_MAX in src/sys/sys/syslimits.h? In-Reply-To: <20090323043937.GA61818@zim.MIT.EDU> References: <49C6F4F4.5030609@acm.poly.edu> <20090323043937.GA61818@zim.MIT.EDU> Message-ID: <20090323125110.GB8686@holyman.cobbled.net> On 23.03-00:39, David Schultz wrote: [ ... ] > There's already a kern.ngroups sysctl, but there are many places > where `ngroups' needs to be used in preference to NGROUPS in the > kernel. In userland, sysconf(_SC_NGROUPS_MAX) needs to be used in > preference to NGROUPS_MAX. you will also note that, as you look at this more, NGROUPS_MAX controls very little regarding the relevant buffers and, generally, without reviewing it again to be specific i'd suggest that you may expose a number of buffer overruns but will most certainly not get the 'correct' behaviour from the change. i.e.
removing NGROUPS_MAX may remove an error message from setgroups but will not increase the buffer allocations or alter relevant code to check NGROUPS_MAX correctly. From ttw+bsd at cobbled.net Mon Mar 23 06:11:45 2009 From: ttw+bsd at cobbled.net (ttw+bsd@cobbled.net) Date: Mon Mar 23 06:11:52 2009 Subject: Doing away with NGROUPS_MAX in src/sys/sys/syslimits.h? In-Reply-To: <49C6F4F4.5030609@acm.poly.edu> References: <49C6F4F4.5030609@acm.poly.edu> Message-ID: <20090323124502.GA8686@holyman.cobbled.net> On 22.03-22:33, Boris Kochergin wrote: > Ahoy. I got bitten by this today--a system I administer for someone had > users in more than 16 groups, so I had to bump the value, recompile the > kernel, and reboot. It seems desirable to (at the very least) make this > a read-only tunable that can be set using /boot/loader.conf, so as to > avoid source modification and kernel recompilation. I had a look around, > and noticed that NGROUPS_MAX is used to construct static arrays in a > couple of locations ("ibcs2_gid_t iset[NGROUPS_MAX];"). It seems that > malloc(9)/MALLOC(9) can be used to allocate memory for the array > instead, and panic() (or something) can be called if the allocation > fails, no? Is that about the gist of it? If I'm not overlooking > something major, I'd like to take a stab at it. i've submitted a patch for this to the hackers@ list but actually bumping the groups limit is more work. i'm pretty far on with it but am unsure when it'll be completed. if anyone wishes a copy of the patches or current working patch then i'd be happy to post it. note that bumping NGROUPS_MAX will do little in itself. From spawk at acm.poly.edu Mon Mar 23 07:21:06 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Mon Mar 23 07:21:12 2009 Subject: Doing away with NGROUPS_MAX in src/sys/sys/syslimits.h?
In-Reply-To: <20090323124502.GA8686@holyman.cobbled.net> References: <49C6F4F4.5030609@acm.poly.edu> <20090323124502.GA8686@holyman.cobbled.net> Message-ID: <49C79A9B.9070309@acm.poly.edu> ttw+bsd@cobbled.net wrote: > On 22.03-22:33, Boris Kochergin wrote: > >> Ahoy. I got bitten by this today--a system I administer for someone had >> users in more than 16 groups, so I had to bump the value, recompile the >> kernel, and reboot. It seems desirable to (at the very least) make this >> a read-only tunable that can be set using /boot/loader.conf, so as to >> avoid source modification and kernel recompilation. I had a look around, >> and noticed that NGROUPS_MAX is used to construct static arrays in a >> couple of locations ("ibcs2_gid_t iset[NGROUPS_MAX];"). It seems that >> malloc(9)/MALLOC(9) can be used to allocate memory for the array >> instead, and panic() (or something) can be called if the allocation >> fails, no? Is that about the gist of it? If I'm not overlooking >> something major, I'd like to take a stab at it. >> > > i've sumbitted a patch for this to hackers@' list but actually > bumping the groups limit is more work. i'm pretty far on with it > but am unsure wwhen it'll be completed. if anyone wishes a copy of > the patches or current working patch then i'd be happy to post it. > > note that bumping NGROUPS_MAX will do little in itself. > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > Well, bumping it does get rid of messages like: Mar 22 20:44:26 hydrogen sshd[96152]: getgrouplist: groups list too small Mar 22 20:44:26 hydrogen sshd[96152]: fatal: initgroups: [user]: Invalid argument ...and allows users who are in more than 16 groups to log in. I think there's something to be said for that. Anyway, thanks for the update. 
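A small userland sketch of the advice given earlier in this thread, namely to ask for the limit at run time with sysconf(_SC_NGROUPS_MAX) rather than relying on the compile-time NGROUPS_MAX. It is written in Python only for brevity and is an illustration, not the method sshd or any patch in this thread uses; the variable names are hypothetical.

```python
import os

# Runtime limit on supplementary groups per process. This can differ from
# the NGROUPS_MAX constant a program was compiled against.
ngroups_max = os.sysconf("SC_NGROUPS_MAX")

# Supplementary groups of the current process, as the kernel reports them.
groups = os.getgroups()

print(f"limit={ngroups_max} current={len(groups)}")
```

A program that sizes its group buffers from the runtime value does not need to be rebuilt when the kernel limit changes, although, as noted in the thread, the kernel side has its own fixed-size arrays.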
I'd love to see a resolution to this other than having to recompile the kernel. Let me know if I can help things along somehow. -Boris
From avg at icyb.net.ua Mon Mar 23 08:34:13 2009 From: avg at icyb.net.ua (Andriy Gapon) Date: Mon Mar 23 08:34:20 2009 Subject: fdc io ports Message-ID: <49C7ABF1.3060808@icyb.net.ua>
There is the following verbose and helpful comment in fdc_isa.c:
> On standard ISA, we don't just use an 8 port range
> (e.g. 0x3f0-0x3f7) since that covers an IDE control register at
> 0x3f6. So, on older hardware, we use 0x3f0-0x3f5 and 0x3f7.
> However, some BIOSs omit the control port, while others start at
> 0x3f2. Of the latter, sometimes we have two resources, other times
> we have one. We have to deal with the following cases:
>
> 1: 0x3f0-0x3f5                                # very rare
> 2: 0x3f0                                      # hints -> 0x3f0-0x3f5,0x3f7
> 3: 0x3f0-0x3f5,0x3f7                          # Most common
> 4: 0x3f2-0x3f5,0x3f7                          # Second most common
> 5: 0x3f2-0x3f5                                # implies 0x3f7 too.
> 6: 0x3f2-0x3f3,0x3f4-0x3f5,0x3f7              # becoming common
> 7: 0x3f2-0x3f3,0x3f4-0x3f5                    # rare
> 8: 0x3f0-0x3f1,0x3f2-0x3f3,0x3f4-0x3f5,0x3f7
> 9: 0x3f0-0x3f3,0x3f4-0x3f5,0x3f7
Looking in fdc.c it seems that it never uses ports at offsets 0, 1 and 3:
> #define FDOUT  2 /* Digital Output Register (W) */
...
> #define FDSTS  4 /* NEC 765 Main Status Register (R) */
> #define FDDSR  4 /* Data Rate Select Register (W) */
> #define FDDATA 5 /* NEC 765 Data Register (R/W) */
> #define FDCTL  7 /* Control Register (W) */
On my system ACPI reserves 0x3f2-0x3f5,0x3f7 ("second most common" above). In the i386/amd64 world, it seems that the only reason to specify hint.fdc.0.port="0x3F0" in GENERIC.hints is to say "we actually do not use 0x3f0 and 0x3f1 ports, but we guess that they might affect fdc, so we'll reserve them just in case". Do we really need this extra safety?
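The port arithmetic above can be checked with a short sketch. The offsets are copied from the #define lines quoted from fdc.c (the defines elided by "..." are not included), and the 0x3f0 base comes from the hint; the names `FDC_BASE` and `REGISTER_OFFSETS` are hypothetical and exist only for this illustration.

```python
# Base of the 8-port window named by hint.fdc.0.port="0x3F0".
FDC_BASE = 0x3F0

# Register offsets as quoted from fdc.c; FDSTS and FDDSR share offset 4.
REGISTER_OFFSETS = {
    "FDOUT": 2,   # Digital Output Register (W)
    "FDSTS": 4,   # Main Status Register (R)
    "FDDSR": 4,   # Data Rate Select Register (W)
    "FDDATA": 5,  # Data Register (R/W)
    "FDCTL": 7,   # Control Register (W)
}

# Ports the quoted defines can touch, and the rest of the window.
used = sorted({FDC_BASE + off for off in REGISTER_OFFSETS.values()})
unused = [port for port in range(FDC_BASE, FDC_BASE + 8) if port not in used]

print("used:  ", [hex(p) for p in used])
print("unused:", [hex(p) for p in unused])
```

Of the ports left over, 0x3f6 is the IDE control register mentioned in the fdc_isa.c comment; the other three are offsets 0, 1 and 3.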
The only reason for me asking is that I am hacking on a driver for a Super I/O chip that actually uses 0x3f0 and 0x3f1 ports and there is a resource conflict with fdc when ACPI is disabled. It's not an issue, but I thought that we could free up those ports. -- Andriy Gapon From ttw+bsd at cobbled.net Mon Mar 23 08:54:36 2009 From: ttw+bsd at cobbled.net (n0g0013) Date: Mon Mar 23 08:54:42 2009 Subject: Doing away with NGROUPS_MAX in src/sys/sys/syslimits.h? In-Reply-To: <49C79A9B.9070309@acm.poly.edu> References: <49C6F4F4.5030609@acm.poly.edu> <20090323124502.GA8686@holyman.cobbled.net> <49C79A9B.9070309@acm.poly.edu> Message-ID: <20090323155433.GA24517@holyman.cobbled.net> On 23.03-10:20, Boris Kochergin wrote: [ ... ] > Well, bumping it does get rid of messages like: > > Mar 22 20:44:26 hydrogen sshd[96152]: getgrouplist: groups list too small > Mar 22 20:44:26 hydrogen sshd[96152]: fatal: initgroups: [user]: Invalid > argument yes, that's great but you may be surprised to learn that it doesn't actually solve your problem. i think (and without looking specifically at the impact may even be confident enough to say definitely) if you get a groups list it will only be cropped and the error message is being erroneously avoided, not corrected. i'd also suggest that you may be opening up your system to some overflows although, generally, the code sections use the same limits and so you might get away with it. [ ... ] > I'd love to see a resolution to this other than having to recompile the > kernel. Let me know if I can help things along somehow. if you can grab my patch, confirm it builds for you and that it doesn't crash your system, that would be a big help. unfortunately i was going to test it on my xen box only to discover that it doesn't work with amd64 yet. i'm currently coding blind and am not a good programmer so this is bad[tm].
if you can do this and are happy to run a few further tests after that then i'll be sure to put some heat under the rest of the process and get the group limits removed correctly. -- t t w From spawk at acm.poly.edu Mon Mar 23 09:24:44 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Mon Mar 23 09:24:52 2009 Subject: Doing away with NGROUPS_MAX in src/sys/sys/syslimits.h? In-Reply-To: <20090323155433.GA24517@holyman.cobbled.net> References: <49C6F4F4.5030609@acm.poly.edu> <20090323124502.GA8686@holyman.cobbled.net> <49C79A9B.9070309@acm.poly.edu> <20090323155433.GA24517@holyman.cobbled.net> Message-ID: <49C7B793.1090308@acm.poly.edu> n0g0013 wrote: > On 23.03-10:20, Boris Kochergin wrote: > [ ... ] > >> Well, bumping it does get rid of messages like: >> >> Mar 22 20:44:26 hydrogen sshd[96152]: getgrouplist: groups list too small >> Mar 22 20:44:26 hydrogen sshd[96152]: fatal: initgroups: [user]: Invalid >> argument >> > > yes, that's great but you may be surprised to learn that it doesn't > actually solve your problem. i think (and without looking > specifically at the impact my even be confident enough to say > definately) if you get a groups list it will only be cropped and the > error message is being erroneously avoided, not corrected. i'd also > suggest that you may be opening up your system to some overflows > although, generally, the code sections use the same limits and so > you might get away with it. > > [ ... ] > >> I'd love to see a resolution to this other than having to recompile the >> kernel. Let me know if I can help things along somehow. >> > > if you can grab my patch, confirm it builds for you and that it doesn't > crash your system , that would be a big help. unfortunately i was > going to test it on my xen box only to discover that it doesn't work > with amd64 yet. i'm currently coding blind and am not a good > programmer so this is bad[tm]. 
> > if you can do this and are happy to run a few further tests after that > then i'll be sure to put some heat under the rest of the process and > get the group limits removed correctly. > > On my 7.0 system, and a kernel recompiled with NGROUPS_MAX set to 64, a getgrouplist() call for a user who is in more than 16 groups (24, to be exact) will populate the array specified by the "gid_t *groups" argument with the 24 groups the user is in, in addition to the group specified in the "gid_t basegid" argument. The value of the variable specified in the "int *ngroups" will also be 25, and the getgrouplist() call will return 0. So, as far as being a hack for a specific problem, it seems to work properly. Sure, I'll test the patch. Can you point me at it? -Boris From pisymbol at gmail.com Mon Mar 23 09:54:49 2009 From: pisymbol at gmail.com (Alexander Sack) Date: Mon Mar 23 09:54:55 2009 Subject: Long double support in FreeBSD? Message-ID: <3c0b01820903230930q1b54f9a5p38f4d6d230a350c7@mail.gmail.com> Hello: I'm working with building the Boost libraries and Boost.Math has long double support stubbed out for FreeBSD (personally I don't need it but..). I believe looking at some historical threads about this over the weekend and a lot of it was due to compiler GNUish bugs handling long double math (I believe Bruce Evans had some patches at one point but mentioned it was still crappy). Can someone speak if the current compiler/BSD flavors support long double math on a 64-bit capable CPU (LM=1)? Thanks! -aps From das at FreeBSD.ORG Mon Mar 23 11:01:19 2009 From: das at FreeBSD.ORG (David Schultz) Date: Mon Mar 23 11:01:26 2009 Subject: Long double support in FreeBSD? 
In-Reply-To: <3c0b01820903230930q1b54f9a5p38f4d6d230a350c7@mail.gmail.com> References: <3c0b01820903230930q1b54f9a5p38f4d6d230a350c7@mail.gmail.com> Message-ID: <20090323180327.GA8943@zim.MIT.EDU> On Mon, Mar 23, 2009, Alexander Sack wrote: > I'm working with building the Boost libraries and Boost.Math has long > double support stubbed out for FreeBSD (personally I don't need it > but..). I believe looking at some historical threads about this over > the weekend and a lot of it was due to compiler GNUish bugs handling > long double math (I believe Bruce Evans had some patches at one point > but mentioned it was still crappy). > > Can someone speak if the current compiler/BSD flavors support long > double math on a 64-bit capable CPU (LM=1)? Long doubles are supported, except that long double versions of the following libm functions are missing: acoshl asinhl atanhl cbrtl coshl erfcl erfl expl expm1l lgammal log10l log1pl log2l logl powl sinhl tanhl tgammal The only other caveat is that on i386 we set the FPU to 53-bit precision so that gcc produces saner results in double precision. (See the archives for the gruesome details.) Of course, if you're running FreeBSD/amd64 on a 64-bit machine, this doesn't apply. From pisymbol at gmail.com Mon Mar 23 11:22:12 2009 From: pisymbol at gmail.com (Alexander Sack) Date: Mon Mar 23 11:22:19 2009 Subject: Long double support in FreeBSD? In-Reply-To: <20090323180327.GA8943@zim.MIT.EDU> References: <3c0b01820903230930q1b54f9a5p38f4d6d230a350c7@mail.gmail.com> <20090323180327.GA8943@zim.MIT.EDU> Message-ID: <3c0b01820903231122mb763be4geb07cafecc80db1b@mail.gmail.com> On Mon, Mar 23, 2009 at 2:03 PM, David Schultz wrote: > On Mon, Mar 23, 2009, Alexander Sack wrote: >> I'm working with building the Boost libraries and Boost.Math has long >> double support stubbed out for FreeBSD (personally I don't need it >> but..). 
I believe looking at some historical threads about this over >> the weekend and a lot of it was due to compiler GNUish bugs handling >> long double math (I believe Bruce Evans had some patches at one point >> but mentioned it was still crappy). >> >> Can someone speak if the current compiler/BSD flavors support long >> double math on a 64-bit capable CPU (LM=1)? > > Long doubles are supported, except that long double versions of > the following libm functions are missing: > > acoshl asinhl atanhl cbrtl coshl erfcl erfl expl expm1l > lgammal log10l log1pl log2l logl powl sinhl tanhl tgammal > > The only other caveat is that on i386 we set the FPU to 53-bit > precision so that gcc produces saner results in double precision. > (See the archives for the gruesome details.) Of course, if you're > running FreeBSD/amd64 on a 64-bit machine, this doesn't apply. > Thank you so much David, that is what I needed to know (I just thought asking would be easier in this case than trying to parse through the many threads over the past about this topic). -aps From chris at young-alumni.com Mon Mar 23 11:34:27 2009 From: chris at young-alumni.com (Chris Ruiz) Date: Mon Mar 23 11:34:40 2009 Subject: ETA for ZFS v. 13 Merge From HEAD ? In-Reply-To: <5f67a8c40903161109le12b8afuc25b8c1ec1b6f70c@mail.gmail.com> References: <78cb3d3f0903151209r46837d70m914a23e30a19060e@mail.gmail.com> <4AE4493D5E9141E8812E4BC83FB5A2A5@PegaPegII> <5f67a8c40903161109le12b8afuc25b8c1ec1b6f70c@mail.gmail.com> Message-ID: <9DCF097D-421A-4F5F-8A48-D0286551C62C@young-alumni.com> On Mar 16, 2009, at 1:09 PM, Zaphod Beeblebrox wrote: > On Sun, Mar 15, 2009 at 6:39 PM, Pegasus Mc Cleaft > wrote: > >> Hi Adrian, >> >> I am not sure, but I didnt think ZFS 13 was ever going to be >> merged into >> 7-stable. I thought the kernel memory requirements were to great >> (just going >> back in my memory on that one).
Also, I think there are still a few >> bugs >> left with the zil being enabled (and/or prefetch) causing lockups >> on machine >> with a lot of IO. I know I have hit that bug a few times on my >> machine when >> using various torrent clients when they want to preallocate large >> amounts of >> diskspace. >> >> I personally cant wait until a later version of ZFS is imported that >> supports encryption. I can finally say good-bye to our GEOM ELI USB >> drives >> for backups!! Never the less, I am quite thankfull to thoes >> involved in >> porting V13 to FreeBSD. Its a wonderfull improvement and my FS of >> choice >> when installing on new machines (especially zfs boot) > > > I think that you're touching on two entirely separate points here... > What it > takes to upgrade ZFS in -STABLE and what it takes to bring ZFS > modules in to > FreeBSD. > > I sincerely hope that ZFSv13 is planned for -STABLE. Last we left > this > issue, testing and a few kernel improvements were in the way. None > of the > kernel improvements were going to change the API, so the project was > doable > in -STABLE. That said, time marches on, 8.0-RELEASE draws ever > nearer. > When we were still several years out on 8.0 and ZFS was causing me > more > problems, I was much more keen to push for the port. I would still > welcome > it with open arms, but I'm not convinced that anyone is going to > push it > forward. > > The issue of encryption (along with many other issues) is tied to the > ability of FreeBSD to compile and use ZFS modules. Just like netgraph > modules extend the function of netgraph.ko and geom modules extend > the base > geom function, ZFS is designed (in Solaris, at least) to take > modules. ZFS > encryption is a module. I'm not clear on compression --- it would > make > sense that it is a module, but it seemingly got copied into FreeBSD > as a > core feature (and it may also be so in solaris). > > Anyways... is there any plans to allow for ZFS modules in FreeBSD? 
AFAIK ZFS v13 requires changes to the kernel that would break the ABI, which is not allowed to change in a STABLE branch. With 8.0 coming within the next 6 months, I doubt that 7 will see a new version of ZFS. There are no problems running ZFS v13 with zil and prefetch enabled and I have not had any predictable out of kernel memory panics. For me, ZFS on CURRENT really is *that* much better. Also, OpenSolaris has yet to integrate ZFS on disk encryption into their source. The code is currently under review: http://opensolaris.org/os/project/zfs-crypto/ . OpenSolaris uses ZFS v14 now and on disk encryption will probably be synced to a newer version of ZFS, meaning that this would require another code sync with OpenSolaris. Chris From zbeeble at gmail.com Mon Mar 23 14:09:20 2009 From: zbeeble at gmail.com (Zaphod Beeblebrox) Date: Mon Mar 23 14:09:33 2009 Subject: ETA for ZFS v. 13 Merge From HEAD ? In-Reply-To: <9DCF097D-421A-4F5F-8A48-D0286551C62C@young-alumni.com> References: <78cb3d3f0903151209r46837d70m914a23e30a19060e@mail.gmail.com> <4AE4493D5E9141E8812E4BC83FB5A2A5@PegaPegII> <5f67a8c40903161109le12b8afuc25b8c1ec1b6f70c@mail.gmail.com> <9DCF097D-421A-4F5F-8A48-D0286551C62C@young-alumni.com> Message-ID: <5f67a8c40903231409q17fd0370wd2907525b8b6aff0@mail.gmail.com> On Mon, Mar 23, 2009 at 2:14 PM, Chris Ruiz wrote: > > AFAIK ZFS v13 requires changes to the kernel that would break the ABI, > which is not allowed to change in a STABLE branch. With 8.0 coming within > the next 6 months, I doubt that 7 will see a new version of ZFS. Can we have someone who actually knows comment on this requirement? The last time this was discussed, I didn't hear this conclusion --- that ZFS 13 required ABI breaking kernel changes. > Also, OpenSolaris has yet to integrate ZFS on disk encryption into their > source. The code is currently under review: > http://opensolaris.org/os/project/zfs-crypto/ . 
OpenSolaris uses ZFS v14 > now and on disk encryption will probably be synced to a newer version of > ZFS, meaning that this would require another code sync with OpenSolaris. It should be said that importing updates from OpenSolaris needs to be easier. From ab at addr.com Mon Mar 23 17:24:42 2009 From: ab at addr.com (Anthony Bourov) Date: Mon Mar 23 17:24:48 2009 Subject: nsdispatch performance issue for large group files Message-ID: <7D939245FB7E4AFCA6AD00BFE5E933E0@ABPC> Regarding performance of: lib/libc/net/nsdispatch.c When used from: lib/libc/net/getgrent.c (called by initgroups()) I don't normally post here but I wanted to make a suggestion on a performance issue that I spotted. I run a large number of high-volume web hosting servers and noticed on some of the servers a severe decrease in Apache's performance when the /etc/group file is large (over 100,000 entries in a group file as it is combined across servers). I did a trace and found the following operation: stat("/etc/nsswitch.conf", {st_mode=052, st_size=4503681233059861, ...}) = 0 repeating as many times as there are groups in the group file. I narrowed the problem down to where apache calls "initgroups()" before forking each process (nothing wrong here). initgroups() then goes through every entry in the group file using getgrent(), which in turn calls nsdispatch, which for every single call does a "stat" on "/etc/nsswitch.conf" to see if it changed. This issue impacts different servers differently: on most of the SCSI servers this delays apache startup by maybe a minute; however, on a Dell SATA raid the "stat" command was significantly slower and caused everything to come to a halt for several minutes every time apache starts. In my opinion this is a very significant performance issue when working with large servers.
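To make the cost pattern above concrete, here is a minimal sketch in C of one way a library could bound the number of stat() calls when it is invoked many times in a row. This is only an illustration under stated assumptions: the names conf_watch, conf_changed and CONF_RECHECK_SECS are hypothetical and are not taken from lib/libc/net/nsdispatch.c, and the one-second interval is an arbitrary choice.

```c
#include <sys/stat.h>
#include <time.h>

/* Hypothetical names for illustration; this is NOT the libc nsdispatch code. */
#define CONF_RECHECK_SECS 1     /* assumed minimum gap between stat() calls */

struct conf_watch {
    const char   *path;         /* file to watch, e.g. "/etc/nsswitch.conf" */
    int           checked;      /* nonzero once a check has been attempted */
    time_t        last_check;   /* time of the most recent stat() attempt */
    time_t        last_mtime;   /* st_mtime recorded by the last good stat() */
    unsigned long stat_calls;   /* how many times stat() was actually called */
};

/*
 * Return 1 if the caller should (re)load the file, 0 otherwise.
 * stat() is called at most once per CONF_RECHECK_SECS, so a tight loop
 * of 100,000 lookups within the same second costs one stat(), not 100,000.
 */
static int
conf_changed(struct conf_watch *w, time_t now)
{
    struct stat sb;
    int first, changed;

    /* Throttle: skip the system call if we checked very recently. */
    if (w->checked && now - w->last_check < CONF_RECHECK_SECS)
        return 0;

    first = !w->checked;
    w->checked = 1;
    w->last_check = now;
    w->stat_calls++;

    if (stat(w->path, &sb) != 0)
        return 0;               /* unreadable file: report "no change" */

    changed = first || sb.st_mtime != w->last_mtime;
    w->last_mtime = sb.st_mtime;
    return changed;
}
```

A caller would pass time(NULL) as "now" on each lookup. The trade-off in this sketch is that an edit to the file may go unnoticed for up to CONF_RECHECK_SECS seconds.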
Most programs, including apache, will call "initgroups()" every time they fork, and if the group file is large this means as many "stat" requests on the file system as there are entries in the group file for every single fork() that the server does. For myself I just made it never test "stat" on "/etc/nsswitch.conf" after the first time since I know that file is never modified. However, a better solution would be for nsdispatch to realise in that case that it is being run in batch mode and should not keep testing if the file has changed. This would affect both "getgrent" and "getpwent". From sean.bruno at dsl-only.net Mon Mar 23 17:25:11 2009 From: sean.bruno at dsl-only.net (Sean Bruno) Date: Mon Mar 23 17:25:18 2009 Subject: BSDCan Firewire Plugfest Message-ID: <1237852708.5106.6.camel@localhost.localdomain> Dan has accepted my proposal to have a plugfest at BSDCan on Friday night: http://www.bsdcan.org/2009/schedule/events/144.en.html I'm hoping that folks can bring their various devices and laptops to the occasion. Even if you don't have a Firewire device, swing by with your laptop and see if there's any new information to be gained from your machines. Sean From wsw1wsw2 at gmail.com Mon Mar 23 23:50:08 2009 From: wsw1wsw2 at gmail.com (Shaowei Wang (wsw)) Date: Mon Mar 23 23:50:15 2009 Subject: A patch of HPTIOP driver for 7.1-RELEASE In-Reply-To: <49710E4F.6020404@delphij.net> References: <2e566b9e0901070005s630c2212k44a0e59a1bcf69aa@mail.gmail.com> <49710E4F.6020404@delphij.net> Message-ID: <2e566b9e0903232328y45801f76lc6d64acb4fef3dc@mail.gmail.com> Hi, delphij The problem with FreeBSD-7.x-amd64's hptiop driver is solved by patching our RAID-manage software (userland utils). The hptrr driver is a soft RAID so a 32-bit compatibility ioctl structure is necessary. The hptiop is a hardware RAID controller, the firmware is 32-bit. I'm not so familiar with FreeBSD's development community. I'm sorry for posting the information here.
On Sat, Jan 17, 2009 at 6:46 AM, Xin LI wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, Shaowei, > > It seems that I can not apply your patch directly, I have tried to do it > manually, as attached, please let me know if it's Ok. I can commit for > you against -HEAD if it looks fine and take care for MFC. > > Note that, however, I am more or less concerned about the driver if > 32-bit utility is running on amd64 platform. There seems to have three > pointer style field in hpt_iop_ioctl_param. I have checked hptrr(4) and > found that it has defined a 32-bit compatibility ioctl structure. > According to my understanding to hptiop(4), this could be a problem. > > PS. For faster handling it is probably a good idea to submit patch > through our PR system: http://www.freebsd.org/send-pr.html > > Shaowei Wang (wsw) wrote: > > Hi, guys > > > > hptiop driver in the 7.1 release has a little bug. > > Because this issue the Raid-manage GUI program which we provided can NOT > > work anymore. > > > > So we give the patch: > > > > Index: hptiop.h > > =================================================================== > > --- hptiop.h (revision 186851) > > +++ hptiop.h (working copy) > > @@ -260,7 +260,7 @@ > > unsigned long lpOutBuffer; /* output data buffer */ > > u_int32_t nOutBufferSize; /* size of output data > buffer > > */ > > unsigned long lpBytesReturned; /* count of HPT_U8s returned > */ > > -}; > > +}__attribute__((packed)); > > > > #define HPT_IOCTL_FLAG_OPEN 1 > > #define HPT_CTL_CODE_BSD_TO_IOP(x) ((x)-0xff00) > > > > ==================================================================== > > > > -wsw > > > > > /************************************************************************/
> > > > Index: hptiop.h > > =================================================================== > > --- hptiop.h (revision 186851) > > +++ hptiop.h (working copy) > > @@ -260,7 +260,7 @@ > > unsigned long lpOutBuffer; /* output data buffer */ > > u_int32_t nOutBufferSize; /* size of output data > buffer > > */ > > unsigned long lpBytesReturned; /* count of HPT_U8s returned > */ > > -}; > > +}__attribute__((packed)); > > > > #define HPT_IOCTL_FLAG_OPEN 1 > > #define HPT_CTL_CODE_BSD_TO_IOP(x) ((x)-0xff00) > > > > ==================================================================== > > > > -wsw > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > freebsd-hackers@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > > To unsubscribe, send any mail to " > freebsd-hackers-unsubscribe@freebsd.org" > > > - -- > Xin LI http://www.delphij.net/ > FreeBSD - The Power to Serve! > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.10 (FreeBSD) > > iEYEARECAAYFAklxDk4ACgkQi+vbBBjt66CvUQCfaAnk0XQTh3Wrn2O4Dy0pEUFW > oqsAoIvlTSNGRDg71SNyGfZ5VjDh9Z93 > =1xSB > -----END PGP SIGNATURE----- > > Index: sys/dev/hptiop/hptiop.h > =================================================================== > --- sys/dev/hptiop/hptiop.h (revision 187338) > +++ sys/dev/hptiop/hptiop.h (working copy)
> @@ -260,7 +260,7 @@ > unsigned long lpOutBuffer; /* output data buffer */ > u_int32_t nOutBufferSize; /* size of output data > buffer */ > unsigned long lpBytesReturned; /* count of HPT_U8s returned > */ > -}; > +} __attribute__((packed)); > > #define HPT_IOCTL_FLAG_OPEN 1 > #define HPT_CTL_CODE_BSD_TO_IOP(x) ((x)-0xff00) > > From samflanker at gmail.com Tue Mar 24 03:13:40 2009 From: samflanker at gmail.com (Vladimir Ermakov) Date: Tue Mar 24 03:13:47 2009 Subject: [problem] aac0 does not respond Message-ID: <49C8AD9B.7000500@gmail.com> Hello, All Describe my problem: have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 2 HHD of 6 have errors in smart data (damaged) i am try read file /var/db/mysql/ibdata1 from this volume system does not respond ( lost access to ssh ) after read 6GB data from this file and print debug messages on ttyv0 As to prevent the emergence of this problem? As monitor the status of RAID-controller? please, any solutions /Vladimir Ermakov ==========================messages on ttyv0================================== Mar 22 20:20:12 df24 kernel: aac0: COMMAND 0xffffffff80859dd0 TIMEOUT AFTER 50 SECONDS Mar 22 20:20:12 df24 kernel: aac0: COMMAND 0xffffffff808599e0 TIMEOUT AFTER 50 SECONDS Mar 22 20:20:12 df24 kernel: aac0: COMMAND 0xffffffff808569c0 TIMEOUT AFTER 50 SECONDS Mar 22 20:20:32 df24 kernel: aac0: COMMAND 0xffffffff80859dd0 TIMEOUT AFTER 70 SECONDS Mar 22 20:20:32 df24 kernel: aac0: COMMAND 0xffffffff808599e0 TIMEOUT AFTER 70 SECONDS Mar 22 20:20:32 df24 kernel: aac0: COMMAND 0xffffffff808569c0 TIMEOUT AFTER 70 SECONDS Mar 22 20:20:52 df24 kernel: aac0: COMMAND 0xffffffff80859dd0 TIMEOUT AFTER 90 SECONDS Mar 22 20:20:52 df24 kernel: aac0: COMMAND 0xffffffff808599e0 TIMEOUT AFTER 90 SECONDS Mar 22 20:20:52 df24 kernel: aac0: COMMAND 0xffffffff808569c0 TIMEOUT AFTER 90 SECONDS Mar 22 20:21:12 df24 kernel: aac0: COMMAND 0xffffffff80859dd0 TIMEOUT AFTER 111 SECONDS Mar 22 20:21:12 df24 kernel: aac0: COMMAND 0xffffffff808599e0 TIMEOUT 
AFTER 111 SECONDS Mar 22 20:21:12 df24 kernel: aac0: COMMAND 0xffffffff808569c0 TIMEOUT AFTER 111 SECONDS =============================================================== # ls -halt /var/db/mysql/ibdata1 -rw-rw---- 1 88 88 256G Mar 22 23:23 /var/db/mysql/ibdata1 # tar -cf - /var/db/mysql/ibdata1 | pv -br > /dev/null 3.73GB [ 146MB/s] # smartctl -a -d scsi /dev/pass4 smartctl version 5.38 [amd64-portbld-freebsd7.1] Copyright (C) 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ Device: FUJITSU MAX3147RC Version: 0104 Serial number: xxxxxxxxxxxxxxxxx Device type: <31> Transport protocol: SAS Local Time is: Tue Mar 24 10:07:08 2009 CET Device supports SMART and is Enabled Temperature Warning Enabled SMART Health Status: OK Current Drive Temperature: 21 C Drive Trip Temperature: 65 C Manufactured in week 18 of year 2006 Recommended maximum start stop count: 10000 times Current start stop count: 46 times Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 0 75782 1488 0 0 31950.874 1488 write: 0 567 0 0 0 12148.416 0 verify: 0 17642 960 0 0 10148.962 960 # uname -a FreeBSD sys3 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #1: Mon Nov 3 18:39:49 UTC 2008 root@sys3:/usr/obj/usr/src/sys/SYS3 amd64 # pciconf -lvc *** aac0@pci0:10:0:0: class=0x010400 card=0x02b69005 chip=0x02859005 rev=0x09 hdr=0x00 vendor = 'Adaptec Inc' device = 'AAC-RAID RAID Controller' class = mass storage subclass = RAID cap 01[98] = powerspec 2 supports D0 D1 D3 current D0 cap 05[a0] = MSI supports 2 messages, 64 bit cap 10[d0] = PCI-Express 1 endpoint cap 03[90] = VPD *** # dmesg | grep aac0 aac0: mem 0xb8a00000-0xb8bfffff irq 16 at device 0.0 on pci10 aac0: Enabling 64-bit address support aac0: Enable Raw I/O aac0: Enable 64-bit array aac0: New comm. 
interface enabled aac0: [ITHREAD] aac0: Adaptec 5805, aac driver 2.0.0-1 aacp0: on aac0 aacp1: on aac0 aacp2: on aac0 aacd0: on aac0 From attilio at freebsd.org Tue Mar 24 03:50:51 2009 From: attilio at freebsd.org (Attilio Rao) Date: Tue Mar 24 03:50:59 2009 Subject: [problem] aac0 does not respond In-Reply-To: <49C8AD9B.7000500@gmail.com> References: <49C8AD9B.7000500@gmail.com> Message-ID: <3bbf2fe10903240324t6616cc9dx6ae28028ac971be6@mail.gmail.com> 2009/3/24 Vladimir Ermakov : > Hello, All > > Describe my problem: > have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 > 2 HHD of 6 ?have errors in smart data (damaged) > i am try read file /var/db/mysql/ibdata1 from this volume > system does not respond ( lost access to ssh ) after read 6GB data from this > file > and print debug messages on ttyv0 > > As to prevent the emergence of this problem? > As monitor the status of RAID-controller? > > please, any solutions Is this -STABLE or -CURRENT? And if it is -CURRENT, what revision? Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein From ttw+bsd at cobbled.net Tue Mar 24 04:09:30 2009 From: ttw+bsd at cobbled.net (n0g0013) Date: Tue Mar 24 04:09:38 2009 Subject: Doing away with NGROUPS_MAX in src/sys/sys/syslimits.h? In-Reply-To: <49C7B793.1090308@acm.poly.edu> References: <49C6F4F4.5030609@acm.poly.edu> <20090323124502.GA8686@holyman.cobbled.net> <49C79A9B.9070309@acm.poly.edu> <20090323155433.GA24517@holyman.cobbled.net> <49C7B793.1090308@acm.poly.edu> Message-ID: <20090324110926.GA14099@holyman.cobbled.net> On 23.03-12:23, Boris Kochergin wrote: [ ... ] > >yes, that's great but you may be surprised to learn that it doesn't > >actually solve your problem. i think (and without looking > >specifically at the impact my even be confident enough to say > >definately) if you get a groups list it will only be cropped and the > >error message is being erroneously avoided, not corrected. 
i'd also > >suggest that you may be opening up your system to some overflows > >although, generally, the code sections use the same limits and so > >you might get away with it. [ ... ] > On my 7.0 system, and a kernel recompiled with NGROUPS_MAX set to 64, a > getgrouplist() call for a user who is in more than 16 groups (24, to be > exact) will populate the array specified by the "gid_t *groups" argument > with the 24 groups the user is in, in addition to the group specified in > the "gid_t basegid" argument. The value of the variable specified in the > "int *ngroups" will also be 25, and the getgrouplist() call will return > 0. So, as far as being a hack for a specific problem, it seems to work > properly. yeah, looked at it now. NGROUPS is defined from NGROUPS_MAX (bad memory). the other significant values would be KI_NGROUPS which is not defined from NGROUPS_MAX; neither are the IPC or RPC relevant values, although, as i said they use their own max values for validation (i.e. they don't suddenly start using NGROUPS_MAX instead of CMGROUPS) so probably won't overflow trivially but i wouldn't say they are necessarily safe. suffice to say if it works for you great but be aware that you may have security and other issues associated with the change. [ ... ] > Sure, I'll test the patch. Can you point me at it? sure, attached. but note it's functionally zero progress, it only looks to remove the dependency on NGROUPS_MAX as a static definition and make SC_NGROUPS_MAX a writable and referenced value. however, it won't currently give you the extra groups you want because it defines other values from _NGROUPS_COMPAT (which is 16) until i can complete the changes to a stable state. it would still be nice to know that i haven't messed it up completely and that, at the very least, the system still boots and runs with it. p.s: i've gotta finish some bloody web stuff and then i'll throw some more time at it this afternoon.
-- t t w -------------- next part -------------- Index: contrib/openpam/lib/openpam_borrow_cred.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/contrib/openpam/lib/openpam_borrow_cred.c,v retrieving revision 1.1.1.9 diff -b -u -r1.1.1.9 openpam_borrow_cred.c --- contrib/openpam/lib/openpam_borrow_cred.c 21 Dec 2007 11:49:29 -0000 1.1.1.9 +++ contrib/openpam/lib/openpam_borrow_cred.c 4 Feb 2009 16:38:46 -0000 @@ -60,6 +60,7 @@ struct pam_saved_cred *scred; const void *scredp; int r; + int ngroups ; ENTERI(pwd->pw_uid); r = pam_get_data(pamh, PAM_SAVED_CRED, &scredp); @@ -73,26 +74,55 @@ (int)geteuid()); RETURNC(PAM_PERM_DENIED); } - scred = calloc(1, sizeof *scred); - if (scred == NULL) - RETURNC(PAM_BUF_ERR); - scred->euid = geteuid(); - scred->egid = getegid(); - r = getgroups(NGROUPS_MAX, scred->groups); - if (r < 0) { - FREE(scred); - RETURNC(PAM_SYSTEM_ERR); - } - scred->ngroups = r; +/* get the maximum number of system groups */ +#if _POSIX_VERSION > 199212 + ngroups = sysconf( _SC_NGROUPS_MAX ) ; +#elif defined(NGROUPS_MAX) + ngroups = NGROUPS_MAX ; +#else + ngroups = _NGROUPS_COMPAT ; +#endif +/* initally allocate enough memory for max_groups */ + scred = malloc( sizeof(struct pam_saved_cred) + + ngroups*sizeof(gid_t) ) ; + if( scred == NULL ) + RETURNC( PAM_BUF_ERR ) ; +/* set the save values */ + scred->euid = geteuid() ; + scred->egid = getegid() ; +/* save groups into our (probably) oversized memory allocation */ + r = getgroups( ngroups, scred->groups ) ; + if( r < 0 ) { + FREE( scred ) ; /* call PAM's free macro */ + RETURNC( PAM_SYSTEM_ERR ) ; + } ; + scred->ngroups = r ; + ngroups = r < ngroups ? r : ngroups ; /* choose the smallest */ + /* ... number of groups to allocate */ + ngroups = ngroups < _NGROUPS_COMPAT ? 
ngroups : _NGROUPS_COMPAT ; + /* but keep it within expected minimum value */ + /* XXX: we don't really want this but until we get + * educated on the implications this is probably safe + * and certainaly compatible */ +/* realloc, releasing unneeded memory */ + scred = realloc( (void*)scred, + sizeof(struct pam_saved_cred)+ngroups*sizeof(gid_t) ) ; + /* nb: we ignore failure and try to store the larger + * ... structure as initially requested. catching the + * ... error in 'pam_set_data' if neccessary. */ +/* save the credentials to PAM user data area */ r = pam_set_data(pamh, PAM_SAVED_CRED, scred, &openpam_free_data); if (r != PAM_SUCCESS) { FREE(scred); RETURNC(r); } +/* set the new credentials */ if (geteuid() == pwd->pw_uid) RETURNC(PAM_SUCCESS); if (initgroups(pwd->pw_name, pwd->pw_gid) < 0 || - setegid(pwd->pw_gid) < 0 || seteuid(pwd->pw_uid) < 0) { + setegid(pwd->pw_gid) < 0 || seteuid(pwd->pw_uid) < 0) + { + /* if any of the set calls failed, then restore and fail */ openpam_restore_cred(pamh); RETURNC(PAM_SYSTEM_ERR); } Index: contrib/openpam/lib/openpam_impl.h =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/contrib/openpam/lib/openpam_impl.h,v retrieving revision 1.1.1.17 diff -b -u -r1.1.1.17 openpam_impl.h --- contrib/openpam/lib/openpam_impl.h 21 Dec 2007 11:49:29 -0000 1.1.1.17 +++ contrib/openpam/lib/openpam_impl.h 5 Feb 2009 15:41:19 -0000 @@ -110,13 +110,17 @@ int env_size; }; -#ifdef NGROUPS_MAX +#if _POSIX_VERSION > 199212 #define PAM_SAVED_CRED "pam_saved_cred" struct pam_saved_cred { uid_t euid; gid_t egid; - gid_t groups[NGROUPS_MAX]; int ngroups; + gid_t groups[]; + /* keep this last so that we can simply + .. over-allocate the amount of space + .. nb: don't use sizeof' unless you adjust + .. 
for the number of groups */ }; #endif Index: include/rpc/auth_unix.h =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/include/rpc/auth_unix.h,v retrieving revision 1.11 diff -b -u -r1.11 auth_unix.h --- include/rpc/auth_unix.h 23 Mar 2002 17:24:55 -0000 1.11 +++ include/rpc/auth_unix.h 14 Jan 2009 11:15:21 -0000 @@ -52,7 +52,7 @@ #define MAX_MACHINE_NAME 255 /* gids compose part of a credential; there may not be more than 16 of them */ -#define NGRPS 16 +#define AUTH_UNIX_NGROUPS 16 /* * Unix style credentials. Index: lib/libc/rpc/auth_unix.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/lib/libc/rpc/auth_unix.c,v retrieving revision 1.18 diff -b -u -r1.18 auth_unix.c --- lib/libc/rpc/auth_unix.c 14 Jun 2007 20:07:35 -0000 1.18 +++ lib/libc/rpc/auth_unix.c 4 Feb 2009 15:31:57 -0000 @@ -182,27 +182,48 @@ * Returns an auth handle with parameters determined by doing lots of * syscalls. 
*/ -AUTH * +AUTH* authunix_create_default() { - int len; char machname[MAXHOSTNAMELEN + 1]; + AUTH* auth_unix ; uid_t uid; gid_t gid; - gid_t gids[NGROUPS_MAX]; - - if (gethostname(machname, sizeof machname) == -1) - abort(); - machname[sizeof(machname) - 1] = 0; + gid_t *gids ; + uint ngroups ; + uint max_ngroups ; + +/* get hostname or fail */ + if( gethostname(machname,sizeof(machname)) == -1 ) + abort() ; + machname[sizeof(machname)-1] = 0 ; /* add a null terminator */ +/* set uid/gid from current effective values */ uid = geteuid(); gid = getegid(); - if ((len = getgroups(NGROUPS_MAX, gids)) < 0) - abort(); - if (len > NGRPS) - len = NGRPS; - /* XXX: interface problem; those should all have been unsigned */ - return (authunix_create(machname, (int)uid, (int)gid, len, - (int *)gids)); +/* set the group set */ +#if _POSIX_VERSION > 199212 + max_ngroups = sysconf( _SC_NGROUPS_MAX ) ; +#elif defined(NGROUPS_MAX) + max_ngroups = NGROUPS_MAX ; +#else + max_ngroups = 16 ; +#endif + gids = (gid_t*)calloc( max_ngroups, sizeof(gid_t) ) ; + if( gids == NULL ) + abort () ; + if( (ngroups=getgroups(max_ngroups,gids)) < 0 ) { + free( gids ) ; + abort() ; + } +/* clip the groups to a transmissable size */ + if( ngroups > AUTH_UNIX_NGROUPS ) + ngroups = AUTH_UNIX_NGROUPS ; +/* XXX: interface problem; those should all have been unsigned */ + auth_unix = authunix_create( machname, + (int)uid, (int)gid, (int)ngroups, + (int*)gids ) ; + free( (void*)gids ) ; + return( auth_unix ) ; } /* Index: lib/libc/rpc/authunix_prot.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/lib/libc/rpc/authunix_prot.c,v retrieving revision 1.10 diff -b -u -r1.10 authunix_prot.c --- lib/libc/rpc/authunix_prot.c 20 Nov 2007 01:51:20 -0000 1.10 +++ lib/libc/rpc/authunix_prot.c 4 Feb 2009 16:03:29 -0000 @@ -67,13 +67,14 @@ paup_gids = &p->aup_gids; - if (xdr_u_long(xdrs, &(p->aup_time)) - && xdr_string(xdrs, &(p->aup_machname), 
MAX_MACHINE_NAME) - && xdr_int(xdrs, &(p->aup_uid)) - && xdr_int(xdrs, &(p->aup_gid)) - && xdr_array(xdrs, (char **) paup_gids, - &(p->aup_len), NGRPS, sizeof(int), (xdrproc_t)xdr_int) ) { - return (TRUE); + if( xdr_u_long(xdrs,&(p->aup_time)) && + xdr_string(xdrs,&(p->aup_machname),MAX_MACHINE_NAME) && + xdr_int(xdrs,&(p->aup_uid)) && + xdr_int(xdrs,&(p->aup_gid)) && + xdr_array(xdrs,(char**)paup_gids,&(p->aup_len), + AUTH_UNIX_NGROUPS,sizeof(int),(xdrproc_t)xdr_int) ) + { + return( TRUE ) ; } - return (FALSE); + return( FALSE ) ; } Index: lib/libc/rpc/netname.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/lib/libc/rpc/netname.c,v retrieving revision 1.8 diff -b -u -r1.8 netname.c --- lib/libc/rpc/netname.c 16 Oct 2004 06:11:35 -0000 1.8 +++ lib/libc/rpc/netname.c 14 Jan 2009 01:29:47 -0000 @@ -61,6 +61,7 @@ #ifndef MAXHOSTNAMELEN #define MAXHOSTNAMELEN 256 #endif + #ifndef NGROUPS #define NGROUPS 16 #endif Index: lib/libc/rpc/netnamer.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/lib/libc/rpc/netnamer.c,v retrieving revision 1.12 diff -b -u -r1.12 netnamer.c --- lib/libc/rpc/netnamer.c 10 Mar 2005 00:58:21 -0000 1.12 +++ lib/libc/rpc/netnamer.c 3 Feb 2009 17:55:48 -0000 @@ -69,7 +69,6 @@ #ifndef NGROUPS #define NGROUPS 16 #endif - /* * Convert network-name into unix credential */ @@ -104,7 +103,7 @@ return (0); } *gidp = (gid_t) atol(p); - for (gidlen = 0; gidlen < NGROUPS; gidlen++) { + for (gidlen = 0; gidlen < _NGROUPS_RPC_MAX; gidlen++) { p = strsep(&res, "\n,"); if (p == NULL) break; @@ -157,7 +156,7 @@ static int _getgroups(uname, groups) char *uname; - gid_t groups[NGROUPS]; + gid_t groups[_NGROUPS_RPC_MAX]; { gid_t ngroups = 0; struct group *grp; @@ -169,10 +168,11 @@ while ((grp = getgrent())) { for (i = 0; grp->gr_mem[i]; i++) if (!strcmp(grp->gr_mem[i], uname)) { - if (ngroups == NGROUPS) { + 
if( ngroups == _NGROUPS_RPC_MAX ) { #ifdef DEBUG - fprintf(stderr, - "initgroups: %s is in too many groups\n", uname); + fprintf( stderr, + "initgroups: %s is in too many groups\n", + uname ) ; #endif goto toomany; } Index: lib/libc/rpc/svc_auth_des.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/lib/libc/rpc/svc_auth_des.c,v retrieving revision 1.9 diff -b -u -r1.9 svc_auth_des.c --- lib/libc/rpc/svc_auth_des.c 22 Mar 2002 23:18:37 -0000 1.9 +++ lib/libc/rpc/svc_auth_des.c 3 Feb 2009 17:51:01 -0000 @@ -452,7 +452,7 @@ short uid; /* cached uid */ short gid; /* cached gid */ short grouplen; /* length of cached groups */ - short groups[NGROUPS]; /* cached groups */ + short groups[_NGROUPS_RPC_MAX]; /* cached groups */ }; /* Index: lib/libc/rpc/svc_auth_unix.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/lib/libc/rpc/svc_auth_unix.c,v retrieving revision 1.11 diff -b -u -r1.11 svc_auth_unix.c --- lib/libc/rpc/svc_auth_unix.c 16 Oct 2004 06:11:35 -0000 1.11 +++ lib/libc/rpc/svc_auth_unix.c 4 Feb 2009 16:04:10 -0000 @@ -68,7 +68,7 @@ struct area { struct authunix_parms area_aup; char area_machname[MAX_MACHINE_NAME+1]; - int area_gids[NGRPS]; + int area_gids[AUTH_UNIX_NGROUPS] ; } *area; u_int auth_len; size_t str_len, gid_len; @@ -98,7 +98,7 @@ aup->aup_uid = (int)IXDR_GET_INT32(buf); aup->aup_gid = (int)IXDR_GET_INT32(buf); gid_len = (size_t)IXDR_GET_U_INT32(buf); - if (gid_len > NGRPS) { + if( gid_len > AUTH_UNIX_NGROUPS ) { stat = AUTH_BADCRED; goto done; } Index: lib/librpcsec_gss/svc_rpcsec_gss.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/lib/librpcsec_gss/svc_rpcsec_gss.c,v retrieving revision 1.4 diff -b -u -r1.4 svc_rpcsec_gss.c --- lib/librpcsec_gss/svc_rpcsec_gss.c 3 Nov 2008 10:38:00 -0000 1.4 +++ 
lib/librpcsec_gss/svc_rpcsec_gss.c 5 Feb 2009 16:09:37 -0000 @@ -127,7 +127,7 @@ rpc_gss_ucred_t cl_ucred; /* unix-style credentials */ bool_t cl_done_callback; /* TRUE after call */ void *cl_cookie; /* user cookie from callback */ - gid_t cl_gid_storage[NGRPS]; + gid_t cl_gid_storage[AUTH_UNIX_NGROUPS]; gss_OID cl_mech; /* mechanism */ gss_qop_t cl_qop; /* quality of protection */ u_int cl_seq; /* current sequence number */ @@ -578,7 +578,7 @@ getpwuid_r(uid, &pwd, buf, sizeof(buf), &pw); if (pw) { - int len = NGRPS; + int len = AUTH_UNIX_NGROUPS; uc->uid = pw->pw_uid; uc->gid = pw->pw_gid; uc->gidlist = client->cl_gid_storage; Index: sys/compat/svr4/svr4_misc.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/sys/compat/svr4/svr4_misc.c,v retrieving revision 1.101 diff -b -u -r1.101 svr4_misc.c --- sys/compat/svr4/svr4_misc.c 21 Apr 2008 21:24:08 -0000 1.101 +++ sys/compat/svr4/svr4_misc.c 14 Jan 2009 11:58:47 -0000 @@ -710,7 +710,12 @@ *retval = 0; break; case SVR4_CONFIG_NGROUPS: - *retval = NGROUPS_MAX; + *retval = _NGROUPS_COMPAT; + /* XXX: this should pull the value + * from sysctl but i cannot find + * the definitions for the similar + * varaibles here (i.e. 
'maxproc') + */ break; case SVR4_CONFIG_CHILD_MAX: *retval = maxproc; Index: sys/fs/portalfs/portal.h =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/sys/fs/portalfs/portal.h,v retrieving revision 1.10 diff -b -u -r1.10 portal.h --- sys/fs/portalfs/portal.h 6 Jan 2005 18:10:40 -0000 1.10 +++ sys/fs/portalfs/portal.h 16 Jan 2009 23:44:50 -0000 @@ -43,7 +43,7 @@ int pcr_flag; /* File open mode */ uid_t pcr_uid; /* From ucred */ short pcr_ngroups; /* From ucred */ - gid_t pcr_groups[NGROUPS]; /* From ucred */ + gid_t pcr_groups[_NGROUPS_COMPAT]; /* From ucred */ }; #ifdef _KERNEL Index: sys/i386/ibcs2/ibcs2_misc.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/sys/i386/ibcs2/ibcs2_misc.c,v retrieving revision 1.70 diff -b -u -r1.70 ibcs2_misc.c --- sys/i386/ibcs2/ibcs2_misc.c 13 Jan 2008 14:44:07 -0000 1.70 +++ sys/i386/ibcs2/ibcs2_misc.c 14 Jan 2009 12:24:56 -0000 @@ -659,14 +659,14 @@ struct thread *td; struct ibcs2_getgroups_args *uap; { - ibcs2_gid_t iset[NGROUPS_MAX]; - gid_t gp[NGROUPS_MAX]; + ibcs2_gid_t iset[_NGROUPS_COMPAT]; + gid_t gp[_NGROUPS_COMPAT]; u_int i, ngrp; int error; if (uap->gidsetsize < 0) return (EINVAL); - ngrp = MIN(uap->gidsetsize, NGROUPS_MAX); + ngrp = MIN(uap->gidsetsize, _NGROUPS_COMPAT); error = kern_getgroups(td, &ngrp, gp); if (error) return (error); @@ -685,11 +685,11 @@ struct thread *td; struct ibcs2_setgroups_args *uap; { - ibcs2_gid_t iset[NGROUPS_MAX]; - gid_t gp[NGROUPS_MAX]; + ibcs2_gid_t iset[_NGROUPS_COMPAT]; + gid_t gp[_NGROUPS_COMPAT]; int error, i; - if (uap->gidsetsize < 0 || uap->gidsetsize > NGROUPS_MAX) + if (uap->gidsetsize < 0 || uap->gidsetsize > _NGROUPS_COMPAT) return (EINVAL); if (uap->gidsetsize && uap->gidset) { error = copyin(uap->gidset, iset, sizeof(ibcs2_gid_t) * @@ -789,8 +789,13 @@ return 0; case IBCS2_SC_NGROUPS_MAX: - mib[1] = KERN_NGROUPS; - break; 
+ /* XXX: IBCS2 compat with group limits not known to + * me, so i'll just return a compatibile/safe limit + * for now */ + PROC_LOCK(p) ; + td->td_retval[0] = _NGROUPS_COMPAT ; + PROC_UNLOCK(p) ; + return( 0 ) ; case IBCS2_SC_OPEN_MAX: PROC_LOCK(p); Index: sys/kern/kern_mib.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/sys/kern/kern_mib.c,v retrieving revision 1.93 diff -b -u -r1.93 kern_mib.c --- sys/kern/kern_mib.c 28 Jan 2009 19:58:05 -0000 1.93 +++ sys/kern/kern_mib.c 4 Feb 2009 13:15:06 -0000 @@ -124,8 +124,8 @@ SYSCTL_INT(_kern, KERN_POSIX1, posix1version, CTLFLAG_RD, 0, _POSIX_VERSION, "Version of POSIX attempting to comply to"); -SYSCTL_INT(_kern, KERN_NGROUPS, ngroups, CTLFLAG_RD, - 0, NGROUPS_MAX, "Maximum number of groups a user can belong to"); +SYSCTL_INT(_kern, KERN_NGROUPS, ngroups, CTLFLAG_RW, + 0, _NGROUPS_COMPAT, "Maximum number of groups allocated to a user"); SYSCTL_INT(_kern, KERN_JOB_CONTROL, job_control, CTLFLAG_RD, 0, 1, "Whether job control is available"); Index: sys/sys/param.h =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/sys/sys/param.h,v retrieving revision 1.382 diff -b -u -r1.382 param.h --- sys/sys/param.h 28 Jan 2009 17:57:16 -0000 1.382 +++ sys/sys/param.h 4 Feb 2009 14:11:55 -0000 @@ -57,7 +57,7 @@ * is created, otherwise 1. */ #undef __FreeBSD_version -#define __FreeBSD_version 800062 /* Master, propagated to newvers */ +#define __FreeBSD_version 800060 /* Master, propagated to newvers */ #ifndef LOCORE #include @@ -77,7 +77,8 @@ #define MAXLOGNAME 17 /* max login name length (incl. 
NUL) */ #define MAXUPRC CHILD_MAX /* max simultaneous processes */ #define NCARGS ARG_MAX /* max bytes for an exec function */ -#define NGROUPS NGROUPS_MAX /* max number groups */ +#define NGROUPS _NGROUPS_COMPAT + /* depreciated check sysctl/sysconf for NGROUPS_MAX value instead */ #define NOFILE OPEN_MAX /* max open files per process */ #define NOGROUP 65535 /* marker for empty group set member */ #define MAXHOSTNAMELEN 256 /* max hostname size */ Index: sys/sys/syslimits.h =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/sys/sys/syslimits.h,v retrieving revision 1.23 diff -b -u -r1.23 syslimits.h --- sys/sys/syslimits.h 29 May 2007 15:14:46 -0000 1.23 +++ sys/sys/syslimits.h 3 Feb 2009 18:02:22 -0000 @@ -54,7 +54,6 @@ #define MAX_CANON 255 /* max bytes in term canon input line */ #define MAX_INPUT 255 /* max bytes in terminal input */ #define NAME_MAX 255 /* max bytes in a file name */ -#define NGROUPS_MAX 16 /* max supplemental group id's */ #ifndef OPEN_MAX #define OPEN_MAX 64 /* max open files per process */ #endif @@ -66,9 +65,35 @@ * We leave the following values undefined to force applications to either * assume conservative values or call sysconf() to get the current value. * - * HOST_NAME_MAX + * HOST_NAME_MAX NGROUPS_MAX * * (We should do this for most of the values currently defined here, * but many programs are not prepared to deal with this yet.) */ +/* + * here are some reference values in respect of the obsoleted + * NGROUPS_MAX value. + * nb: some apps appear to check NGROUPS_MAX as meaning that + * ... system has user groups (i.e. to #ifdef chunks of code). + * ... this is easy to change but maybe historically defined? + */ +#define _NGROUPS_RPC_MAX 16 /* reference only */ + /* nb: this is the old system max, so named + * ... because it's limit appears to + * ... have been derived from a limitation + * ... in RPC (and thereby NFS), where it's + * ... 
the max number of groups we can exchange */ +#define _NGROUPS_COMPAT _NGROUPS_RPC_MAX /* reference only */ + /* nb: although this is defined as equal to the rpc + * ... limit, i have defined it distintly so that + * ... we may distinguish (whilst updating) usage + * ... that is correctly explicit (i.e. should be 16) + * ... and usage that is only 16 because of an expected + * ... convention. hopefully we may remove these and + * ... define additional _NGROUPS_*_MAX for those defined + * ... uses. */ +#define _NGROUPS_SYS_MAX 65536 /* reference only */ + /* nb: the idea's to have this extensible + * ... indefinately, this is what linux have and + * ... should more than cover immediate needs */ #endif Index: usr.bin/catman/catman.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/usr.bin/catman/catman.c,v retrieving revision 1.14 diff -b -u -r1.14 catman.c --- usr.bin/catman/catman.c 5 Dec 2005 14:22:12 -0000 1.14 +++ usr.bin/catman/catman.c 8 Feb 2009 22:51:44 -0000 @@ -93,8 +93,9 @@ enum Ziptype {NONE, BZIP, GZIP}; static uid_t uid; -static gid_t gids[NGROUPS_MAX]; +static gid_t *gids; static int ngids; +static int max_ngroups ; static int starting_dir; static char tmp_file[MAXPATHLEN]; struct stat test_st; @@ -789,7 +790,15 @@ /* NOTREACHED */ } } - ngids = getgroups(NGROUPS_MAX, gids); +/* allocate memory for group ids */ +#if _POSIX_VERSION > 199212 + max_ngroups = sysconf( _SC_NGROUPS_MAX ) ; +#elif defined(NGROUPS_MAX) + max_ngroups = NGROUPS_MAX ; +#else + max_ngroups = _NGROUPS_COMPAT ; +#endif + ngids = getgroups( max_ngroups, gids ) ; if ((starting_dir = open(".", 0)) < 0) { err(1, "."); } Index: usr.bin/newgrp/newgrp.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/usr.bin/newgrp/newgrp.c,v retrieving revision 1.2 diff -b -u -r1.2 newgrp.c --- usr.bin/newgrp/newgrp.c 30 Oct 2003 15:14:34 -0000 1.2 +++ 
usr.bin/newgrp/newgrp.c 9 Feb 2009 22:05:53 -0000 @@ -146,9 +146,10 @@ static void addgroup(const char *grpname) { - gid_t grps[NGROUPS_MAX]; + gid_t *grps; long lgid; - int dbmember, i, ngrps; + int dbmember, i, ngrps, max_ngroups ; + /* XXX: should 'max_ngroups' be a static const variable? */ gid_t egid; struct group *grp; char *ep, *pass; @@ -185,9 +186,21 @@ } } - if ((ngrps = getgroups(NGROUPS_MAX, (gid_t *)grps)) < 0) { +#if _POSIX_VERSION >= 199212 + max_ngroups = sysconf( _SC_NGROUPS_MAX ) ; +#elif defined(NGROUPS_MAX) + max_ngroups = NGROUPS_MAX ; +#else + max_ngroups = _NGROUPS_COMPAT ; +#endif + grps = (gid_t*)calloc( max_ngroups, sizeof(gid_t) ) ; + if( grps == NULL ) { + warn( "group set memory allocation" ) ; + return ; + } + if( (ngrps=getgroups(max_ngroups,(gid_t*)grps)) < 0 ) { warn("getgroups"); - return; + goto error_free ; } /* Remove requested gid from supp. list if it exists. */ @@ -201,7 +214,7 @@ if (setgroups(ngrps, (const gid_t *)grps) < 0) { PRIV_END; warn("setgroups"); - return; + goto error_free ; } PRIV_END; } @@ -210,14 +223,14 @@ if (setgid(grp->gr_gid)) { PRIV_END; warn("setgid"); - return; + goto error_free ; } PRIV_END; grps[0] = grp->gr_gid; /* Add old effective gid to supp. list if it does not exist. 
*/ if (egid != grp->gr_gid && !inarray(egid, grps, ngrps)) { - if (ngrps == NGROUPS_MAX) + if( ngrps == max_ngroups ) warnx("too many groups"); else { grps[ngrps++] = egid; @@ -225,12 +238,15 @@ if (setgroups(ngrps, (const gid_t *)grps)) { PRIV_END; warn("setgroups"); - return; + goto error_free ; } PRIV_END; } } +error_free: + free( grps ) ; + return ; } static int Index: usr.sbin/chown/chown.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/usr.sbin/chown/chown.c,v retrieving revision 1.29 diff -b -u -r1.29 chown.c --- usr.sbin/chown/chown.c 7 Aug 2004 04:19:37 -0000 1.29 +++ usr.sbin/chown/chown.c 8 Feb 2009 16:22:31 -0000 @@ -269,7 +269,8 @@ { static uid_t euid = -1; static int ngroups = -1; - gid_t groups[NGROUPS_MAX]; + static int max_groups ; + gid_t *groups; /* Check for chown without being root. */ if (errno != EPERM || (uid != (uid_t)-1 && @@ -279,16 +280,31 @@ } /* Check group membership; kernel just returns EPERM. 
*/ +#if _POSIX_VERSION >= 199212 + max_groups = sysconf( _SC_NGROUPS_MAX ) ; +#elif defined(NGROUPS_MAX) + max_groups = NGROUPS_MAX ; +#else + max_groups = _NGROUPS_COMPAT ; +#endif + groups = (gid_t*)calloc( max_groups, sizeof(gid_t) ) ; + if( groups == NULL ) { + warnx( "failed to allocate memory for group set" ) ; + goto exit_cleanup ; + } if (gid != (gid_t)-1 && ngroups == -1 && euid == (uid_t)-1 && (euid = geteuid()) != 0) { - ngroups = getgroups(NGROUPS_MAX, groups); + ngroups = getgroups( max_groups, groups ) ; while (--ngroups >= 0 && gid != groups[ngroups]); if (ngroups < 0) { warnx("you are not a member of group %s", gname); - return; + goto exit_cleanup ; } } warn("%s", file); +exit_cleanup: + free( groups ) ; + return ; } void Index: usr.sbin/chroot/chroot.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/usr.sbin/chroot/chroot.c,v retrieving revision 1.11 diff -b -u -r1.11 chroot.c --- usr.sbin/chroot/chroot.c 7 Aug 2004 04:19:37 -0000 1.11 +++ usr.sbin/chroot/chroot.c 5 Feb 2009 23:29:48 -0000 @@ -59,6 +59,7 @@ char *user; /* user to switch to before running program */ char *group; /* group to switch to ... */ char *grouplist; /* group list to switch to ... 
*/ +int max_ngroups; /* max number of groups allowable */ int main(argc, argv) @@ -69,12 +70,25 @@ struct passwd *pw; char *endp, *p; const char *shell; - gid_t gid, gidlist[NGROUPS_MAX]; + gid_t gid, *gidlist ; uid_t uid; - int ch, gids; + int ch, gids ; +/* set some defaults */ gid = 0; uid = 0; + user = NULL ; + group = NULL ; + grouplist = NULL ; +#if _POSIX_VERSION >= 199212 + max_ngroups = sysconf( _SC_NGROUPS_MAX ) ; +#elif defined(NGROUPS_MAX) + max_ngroups = NGROUPS_MAX ; +#else + max_ngroups = _NGROUPS_COMPAT ; +#endif + +/* process command line options */ while ((ch = getopt(argc, argv, "G:g:u:")) != -1) { switch(ch) { case 'u': @@ -103,9 +117,12 @@ if (argc < 1) usage(); +/* if a group argument was passed then process it */ if (group != NULL) { + /* if the first char's a digit then assume it's a gid ... */ if (isdigit((unsigned char)*group)) { gid = (gid_t)strtoul(group, &endp, 0); + /* ... and back out that assumption if it proves wrong */ if (*endp != '\0') goto getgroup; } else { @@ -117,8 +134,15 @@ } } - for (gids = 0; - (p = strsep(&grouplist, ",")) != NULL && gids < NGROUPS_MAX; ) { +/* process command line group list */ + if( grouplist != NULL ) { + gidlist = (gid_t*)calloc( max_ngroups, sizeof(gid_t) ) ; + if( gidlist == NULL ) + errx( 1, "inadquate memory for group list" ) ; + for( gids = 0 ; + gids < max_ngroups && + (p=strsep(&grouplist,",")) != NULL ; ) + { if (*p == '\0') continue; @@ -135,9 +159,11 @@ } gids++; } - if (p != NULL && gids == NGROUPS_MAX) + if( p != NULL && gids == max_ngroups ) errx(1, "too many supplementary groups provided"); + } +/* set user from command line option, if supplied */ if (user != NULL) { if (isdigit((unsigned char)*user)) { uid = (uid_t)strtoul(user, &endp, 0); @@ -152,9 +178,11 @@ } } +/* change root */ if (chdir(argv[0]) == -1 || chroot(".") == -1) err(1, "%s", argv[0]); +/* set credentials */ if (gids && setgroups(gids, gidlist) == -1) err(1, "setgroups"); if (group && setgid(gid) == -1) @@ -162,11 
+190,14 @@ if (user && setuid(uid) == -1) err(1, "setuid"); +/* exec the remaining arguments as the chroot'd command ... */ if (argv[1]) { execvp(argv[1], &argv[1]); err(1, "%s", argv[1]); + /* NOTREACHED */ } +/* ... or execute the default system shell */ if (!(shell = getenv("SHELL"))) shell = _PATH_BSHELL; execlp(shell, shell, "-i", (char *)NULL); Index: usr.sbin/gssd/gssd.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/usr.sbin/gssd/gssd.c,v retrieving revision 1.1 diff -b -u -r1.1 gssd.c --- usr.sbin/gssd/gssd.c 3 Nov 2008 10:38:00 -0000 1.1 +++ usr.sbin/gssd/gssd.c 5 Feb 2009 16:16:37 -0000 @@ -464,8 +464,8 @@ result->uid = uid; getpwuid_r(uid, &pwd, buf, sizeof(buf), &pw); if (pw) { - int len = NGRPS; - int groups[NGRPS]; + int len = AUTH_UNIX_NGROUPS ; + int groups[AUTH_UNIX_NGROUPS] ; result->gid = pw->pw_gid; getgrouplist(pw->pw_name, pw->pw_gid, groups, &len); Index: usr.sbin/mount_portalfs/cred.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/usr.sbin/mount_portalfs/cred.c,v retrieving revision 1.1 diff -b -u -r1.1 cred.c --- usr.sbin/mount_portalfs/cred.c 11 Mar 2005 08:39:58 -0000 1.1 +++ usr.sbin/mount_portalfs/cred.c 16 Jan 2009 23:49:36 -0000 @@ -46,7 +46,7 @@ set_user_credentials(struct portal_cred *user, struct portal_cred *save) { save->pcr_uid = geteuid(); - if ((save->pcr_ngroups = getgroups(NGROUPS_MAX, save->pcr_groups)) < 0) + if( (save->pcr_ngroups=getgroups(_NGROUPS_COMPAT,save->pcr_groups)) < 0 ) return (-1); if (setgroups(user->pcr_ngroups, user->pcr_groups) < 0) return (-1); Index: usr.sbin/pppd/options.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/usr.sbin/pppd/options.c,v retrieving revision 1.26 diff -b -u -r1.26 options.c --- usr.sbin/pppd/options.c 7 Nov 2007 10:53:38 -0000 1.26 +++ 
usr.sbin/pppd/options.c 10 Feb 2009 09:11:47 -0000 @@ -72,10 +72,6 @@ char *strdup(char *); #endif -#ifndef GIDSET_TYPE -#define GIDSET_TYPE gid_t -#endif - /* * Option variables and default values. */ @@ -779,23 +775,64 @@ int fd; { uid_t uid; - int ngroups, i; + int ngroups, max_ngroups, i; struct stat sbuf; - GIDSET_TYPE groups[NGROUPS_MAX]; + gid_t *groups; +/* get the uid */ uid = getuid(); +/* ... and return true if root */ +/* XXX: needs credential check */ if (uid == 0) return 1; + +/* if we're not root, get some info about the file */ if (fstat(fd, &sbuf) != 0) return 0; + +/* test for owner match with current process */ if (sbuf.st_uid == uid) return sbuf.st_mode & S_IRUSR; +/* ... and a group match */ if (sbuf.st_gid == getgid()) return sbuf.st_mode & S_IRGRP; - ngroups = getgroups(NGROUPS_MAX, groups); - for (i = 0; i < ngroups; ++i) - if (sbuf.st_gid == groups[i]) - return sbuf.st_mode & S_IRGRP; + +/* if we've still no luck then check the group list for permission match */ +#if _POSIX_VERSION >= 199212 + max_ngroups = sysconf( _SC_NGROUPS_MAX ) ; +#elif defined(NGROUPS_MAX) + max_ngroups = NGROUPS_MAX ; +#else + max_ngroups = _NGROUPS_COMPAT ; +#endif + groups = (gid_t*) calloc( max_ngroups, sizeof(gid_t) ) ; + if( groups == NULL ) { + /* if we cannot check groups correctly then assume 'fd' is unreadable + * XXX: this may be false as the converse is more likely. + * i.e. it would be failed readable on available groups + * and granted on full list, however, we just can't be + * psychic and i'm not about to code some idiotic loop that tries + * to get 'some' memory for partial testing. probably a better + * recourse would be to simply die here but that seems severe + * for a 'readable' test. + * NB: we don't need a 'full' allocation of memory to test the + * group list, only to store it. 
one idea would be to do this in + * 'blocks' + */ + option_error( 1, "unable to allocate memory for group list" ) ; + return( 0 ) ; + } +/* get groups */ + ngroups = getgroups( max_ngroups, groups ) ; +/* ... and test the group permission if matching */ + for( i = 0 ; i < ngroups ; ++i ) { + if (sbuf.st_gid == groups[i]) { + free( (void*)groups) ; + return( sbuf.st_mode & S_IRGRP ) ; + } + } +/* otherwise return other permissions match */ + free( (void*)groups ) ; return sbuf.st_mode & S_IROTH; } Index: usr.sbin/rpc.lockd/kern.c =================================================================== RCS file: /home/__orole/dev/cabinet/zeeNi/ai/freebsd/src/usr.sbin/rpc.lockd/kern.c,v retrieving revision 1.21 diff -b -u -r1.21 kern.c --- usr.sbin/rpc.lockd/kern.c 17 Aug 2006 05:55:20 -0000 1.21 +++ usr.sbin/rpc.lockd/kern.c 5 Feb 2009 16:22:17 -0000 @@ -239,15 +239,15 @@ int ngroups; ngroups = xucred->cr_ngroups - 1; - if (ngroups > NGRPS) - ngroups = NGRPS; - if (cl->cl_auth != NULL) - cl->cl_auth->ah_ops->ah_destroy(cl->cl_auth); - cl->cl_auth = authunix_create(hostname, + if( ngroups > AUTH_UNIX_NGROUPS ) + ngroups = AUTH_UNIX_NGROUPS ; + if( cl->cl_auth != NULL ) + cl->cl_auth->ah_ops->ah_destroy( cl->cl_auth ) ; + cl->cl_auth = authunix_create( hostname, xucred->cr_uid, xucred->cr_groups[0], ngroups, - &xucred->cr_groups[1]); + &xucred->cr_groups[1] ) ; } From samflanker at gmail.com Tue Mar 24 05:46:47 2009 From: samflanker at gmail.com (Vladimir Ermakov) Date: Tue Mar 24 05:46:55 2009 Subject: [problem] aac0 does not respond In-Reply-To: <49C8AD9B.7000500@gmail.com> References: <49C8AD9B.7000500@gmail.com> Message-ID: <49C8D63E.4050107@gmail.com> Vladimir Ermakov wrote: > Hello, All > > Describe my problem: > have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 > 2 HHD of 6 have errors in smart data (damaged) > i am try read file /var/db/mysql/ibdata1 from this volume > system does not respond ( lost access to ssh ) after read 6GB data > from this file > and 
print debug messages on ttyv0 > > As to prevent the emergence of this problem? > As monitor the status of RAID-controller? > similar problem http://lists.freebsd.org/pipermail/freebsd-scsi/2008-June/003524.html /Vladimir Ermakov From samflanker at gmail.com Tue Mar 24 05:48:37 2009 From: samflanker at gmail.com (Vladimir Ermakov) Date: Tue Mar 24 05:48:45 2009 Subject: [problem] aac0 does not respond In-Reply-To: <49C8B5E3.2000104@gmail.com> References: <49C8AD9B.7000500@gmail.com> <3bbf2fe10903240324t6616cc9dx6ae28028ac971be6@mail.gmail.com> <49C8B5E3.2000104@gmail.com> Message-ID: <49C8D6AD.7050501@gmail.com> Attilio Rao wrote: > 2009/3/24 Vladimir Ermakov : > >> Hello, All >> >> Describe my problem: >> have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 >> 2 HHD of 6 have errors in smart data (damaged) >> i am try read file /var/db/mysql/ibdata1 from this volume >> system does not respond ( lost access to ssh ) after read 6GB data >> from this >> file >> and print debug messages on ttyv0 >> >> As to prevent the emergence of this problem? >> As monitor the status of RAID-controller? >> >> please, any solutions >> > > Is this -STABLE or -CURRENT? > And if it is -CURRENT, what revision? > > Thanks, > Attilio > > > it -STABLE thx /Vladimir Ermakov From pluknet at gmail.com Tue Mar 24 06:22:42 2009 From: pluknet at gmail.com (pluknet) Date: Tue Mar 24 06:22:49 2009 Subject: [problem] aac0 does not respond In-Reply-To: <49C8AD9B.7000500@gmail.com> References: <49C8AD9B.7000500@gmail.com> Message-ID: 2009/3/24 Vladimir Ermakov : > Hello, All > > Describe my problem: > have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 > 2 HHD of 6 ?have errors in smart data (damaged) > i am try read file /var/db/mysql/ibdata1 from this volume > system does not respond ( lost access to ssh ) after read 6GB data from this > file > and print debug messages on ttyv0 > > As to prevent the emergence of this problem? > As monitor the status of RAID-controller? 
> You can check status of aac controller with arcconf utility and post results there.. -- wbr, pluknet From samflanker at gmail.com Tue Mar 24 07:25:17 2009 From: samflanker at gmail.com (Vladimir Ermakov) Date: Tue Mar 24 07:25:24 2009 Subject: [problem] aac0 does not respond In-Reply-To: References: <49C8AD9B.7000500@gmail.com> Message-ID: <49C8ED57.9080807@gmail.com> pluknet wrote: > 2009/3/24 Vladimir Ermakov : > >> Hello, All >> >> Describe my problem: >> have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 >> > > You can check status of aac controller with arcconf utility > > and post results there.. > > OK i am try enter command: # arcconf task start 1 device 0 4 verify after viewed logs (note, from kernel config: options AAC_DEBUG=2) # less /var/log/messages Mar 24 14:43:24 sys3 kernel: aac0: JobProgress (97) - running (9500000, 10000000) Mar 24 14:43:24 sys3 kernel: aac0: (ScsiVerify) handle 4 Mar 24 14:44:10 sys3 kernel: aac0: JobProgress (98) - running (9600000, 10000000) Mar 24 14:44:10 sys3 kernel: aac0: (ScsiVerify) handle 4 Mar 24 14:44:55 sys3 kernel: aac0: JobProgress (99) - running (9700000, 10000000) Mar 24 14:44:55 sys3 kernel: Mar 24 14:44:55 sys3 kernel: aac0: (ScsiVerify) handle 4 Mar 24 14:45:41 sys3 kernel: aac0: JobProgress (100) - running (9800000, 10000000) Mar 24 14:45:41 sys3 kernel: aac0: (ScsiVerify) handle 4 Mar 24 14:46:28 sys3 kernel: aac0: JobProgress (101) - running (9900000, 10000000) Mar 24 14:46:28 sys3 kernel: aac0: (ScsiVerify) handle 4 Mar 24 14:47:15 sys3 kernel: aac0: EventNotify(0) Mar 24 14:47:15 sys3 kernel: aac0: (23) Mar 24 14:47:15 sys3 kernel: aac0: JobProgress (102) - success (9900000, 10000000) Mar 24 14:47:15 sys3 kernel: aac0: (ScsiVerify) handle 4 # arcconf getlogs 1 EVENT *** References: <2e566b9e0901070005s630c2212k44a0e59a1bcf69aa@mail.gmail.com> <49710E4F.6020404@delphij.net> <2e566b9e0903232328y45801f76lc6d64acb4fef3dc@mail.gmail.com> Message-ID: <49C90791.7040807@delphij.net> -----BEGIN PGP SIGNED 
MESSAGE----- Hash: SHA1 Hi, Shaowei Wang (wsw) wrote: > Hi, delphij > > The problem about FreeBSD-7.x-amd64's hptiop driver is solved by > patching our RAID-manage software (userland utils). > > The hptrr driver is a soft RAID so a 32-bit compatibility ioctl > structure is necessary. The hptiop is a hardware RAID controller, the > firmware is 32-bit. So do we need to patch the driver at our side? My reading is that we will not need it anymore? Please feel free to let me know if you want the patch be committed. Since we are going to have 7.2-RELEASE by early May, it's important to merge stuff back early so they get more through tests, etc. > I'm not so familiar with FreeBSD's development community. I'm sorry > Posting the infomation here. Never mind, the PR system is just a more convenient way of tracking issues (i.e. you can check back if a problem has been resolved at a later time, etc.). > On Sat, Jan 17, 2009 at 6:46 AM, Xin LI > wrote: > > Hi, Shaowei, > > It seems that I can not apply your patch directly, I have tried to do it > manually, as attached, please let me know if it's Ok. I can commit for > you against -HEAD if it looks fine and take care for MFC. > > Note that, however, I am more or less concerned about the driver if > 32-bit utility is running on amd64 platform. There seems to have three > pointer style field in hpt_iop_ioctl_param. I have checked hptrr(4) and > found that it has defined a 32-bit compatibility ioctl structure. > According to my understanding to hptiop(4), this could be a problem. > > PS. For faster handling it is probably a good idea to submit patch > through our PR system: http://www.freebsd.org/send-pr.html > > Shaowei Wang (wsw) wrote: >> Hi, guys > >> hptiop driver in the 7.1 release has a little bug. >> Because this issue the Raid-manage GUI program which we provided > can NOT >> work anymore. 
>
>> So we give the patch:
>
>> Index: hptiop.h
>> ===================================================================
>> --- hptiop.h (revision 186851)
>> +++ hptiop.h (working copy)
>> @@ -260,7 +260,7 @@
>> unsigned long lpOutBuffer; /* output data buffer */
>> u_int32_t nOutBufferSize; /* size of output data buffer */
>> unsigned long lpBytesReturned; /* count of HPT_U8s returned */
>> -};
>> +}__attribute__((packed));
>
>> #define HPT_IOCTL_FLAG_OPEN 1
>> #define HPT_CTL_CODE_BSD_TO_IOP(x) ((x)-0xff00)
>
>> ====================================================================
>
>> -wsw
>
> ------------------------------------------------------------------------
>
>> _______________________________________________
>> freebsd-hackers@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>

Index: sys/dev/hptiop/hptiop.h
===================================================================
- --- sys/dev/hptiop/hptiop.h (revision 187338)
+++ sys/dev/hptiop/hptiop.h (working copy)
@@ -260,7 +260,7 @@ unsigned long lpOutBuffer; /* output data buffer */ u_int32_t nOutBufferSize; /* size of output data buffer */ unsigned long lpBytesReturned; /* count of HPT_U8s returned */ - -}; +} __attribute__((packed)); #define HPT_IOCTL_FLAG_OPEN 1 #define HPT_CTL_CODE_BSD_TO_IOP(x) ((x)-0xff00) - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) iEYEARECAAYFAknJB5EACgkQi+vbBBjt66CPRwCeLna7weWqMVK8G/MPFcpIR5Xb z3QAn39CaWIMqTUBmj/EnAc9i09byweF =ylVm -----END PGP SIGNATURE----- From wsw1wsw2 at gmail.com Tue Mar 24 17:45:25 2009 From: wsw1wsw2 at gmail.com (Shaowei Wang (wsw)) Date: Tue Mar 24 17:45:32 2009 Subject: A patch of HPTIOP driver for 7.1-RELEASE In-Reply-To: <49C90791.7040807@delphij.net> References: <2e566b9e0901070005s630c2212k44a0e59a1bcf69aa@mail.gmail.com> <49710E4F.6020404@delphij.net> <2e566b9e0903232328y45801f76lc6d64acb4fef3dc@mail.gmail.com> <49C90791.7040807@delphij.net> Message-ID: <2e566b9e0903241745p6dc9ba4bq38a555b3896c23fb@mail.gmail.com> On Wed, Mar 25, 2009 at 12:17 AM, Xin LI wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi, > > Shaowei Wang (wsw) wrote: > > Hi, delphij > > > > The problem about FreeBSD-7.x-amd64's hptiop driver is solved by > > patching our RAID-manage software (userland utils). > > > > The hptrr driver is a soft RAID so a 32-bit compatibility ioctl > > structure is necessary. The hptiop is a hardware RAID controller, the > > firmware is 32-bit. > > So do we need to patch the driver at our side? My reading is that we > will not need it anymore? Please feel free to let me know if you want > the patch be committed. Since we are going to have 7.2-RELEASE by early > May, it's important to merge stuff back early so they get more through > tests, etc. > Yes, this patch should be committed when we going to have the next FreeBSD release. Thanks! > > I'm not so familiar with FreeBSD's development community. 
I'm sorry > > Posting the infomation here. > > Never mind, the PR system is just a more convenient way of tracking > issues (i.e. you can check back if a problem has been resolved at a > later time, etc.). I'll try to use the PR system next time and thank you again. > > > On Sat, Jan 17, 2009 at 6:46 AM, Xin LI > > wrote: > > > > Hi, Shaowei, > > > > It seems that I can not apply your patch directly, I have tried to do it > > manually, as attached, please let me know if it's Ok. I can commit for > > you against -HEAD if it looks fine and take care for MFC. > > > > Note that, however, I am more or less concerned about the driver if > > 32-bit utility is running on amd64 platform. There seems to have three > > pointer style field in hpt_iop_ioctl_param. I have checked hptrr(4) and > > found that it has defined a 32-bit compatibility ioctl structure. > > According to my understanding to hptiop(4), this could be a problem. > > > > PS. For faster handling it is probably a good idea to submit patch > > through our PR system: http://www.freebsd.org/send-pr.html > > > > Shaowei Wang (wsw) wrote: > >> Hi, guys > > > >> hptiop driver in the 7.1 release has a little bug. > >> Because this issue the Raid-manage GUI program which we provided > > can NOT > >> work anymore. 
> > >
> >> So we give the patch:
> >
> >> Index: hptiop.h
> >> ===================================================================
> >> --- hptiop.h (revision 186851)
> >> +++ hptiop.h (working copy)
> >> @@ -260,7 +260,7 @@
> >> unsigned long lpOutBuffer; /* output data buffer */
> >> u_int32_t nOutBufferSize; /* size of output data buffer */
> >> unsigned long lpBytesReturned; /* count of HPT_U8s returned */
> >> -};
> >> +}__attribute__((packed));
> >
> >> #define HPT_IOCTL_FLAG_OPEN 1
> >> #define HPT_CTL_CODE_BSD_TO_IOP(x) ((x)-0xff00)
> >
> >> ====================================================================
> >
> >> -wsw
> >
> > ------------------------------------------------------------------------
> >
> >> _______________________________________________
> >> freebsd-hackers@freebsd.org mailing list
> >> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> >> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
> >
> >
> Index: sys/dev/hptiop/hptiop.h
> ===================================================================
> - --- sys/dev/hptiop/hptiop.h (revision 187338)
> +++ sys/dev/hptiop/hptiop.h (working copy)
> @@ -260,7 +260,7 @@ > unsigned long lpOutBuffer; /* output data buffer */ > u_int32_t nOutBufferSize; /* size of output > data buffer */ > unsigned long lpBytesReturned; /* count of HPT_U8s > returned */ > - -}; > +} __attribute__((packed)); > > #define HPT_IOCTL_FLAG_OPEN 1 > #define HPT_CTL_CODE_BSD_TO_IOP(x) ((x)-0xff00) > > > > > - -- > Xin LI http://www.delphij.net/ > FreeBSD - The Power to Serve! > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.11 (FreeBSD) > > iEYEARECAAYFAknJB5EACgkQi+vbBBjt66CPRwCeLna7weWqMVK8G/MPFcpIR5Xb > z3QAn39CaWIMqTUBmj/EnAc9i09byweF > =ylVm > -----END PGP SIGNATURE----- > From ota at j.email.ne.jp Tue Mar 24 21:12:12 2009 From: ota at j.email.ne.jp (Yoshihiro Ota) Date: Tue Mar 24 21:12:24 2009 Subject: 2 uni-directional TCP connection good? In-Reply-To: <49C35A58.2030607@prgmr.com> References: <20090320045319.04484fc5.ota@j.email.ne.jp> <49C35A58.2030607@prgmr.com> Message-ID: <20090325001205.aea0f0d1.ota@j.email.ne.jp> On Fri, 20 Mar 2009 01:56:56 -0700 Michael David Crawford wrote: > Yoshihiro Ota wrote: > > > I saw a program that opens 2 TCP connections. > > One connection is only used for server to client messaging only > > and the other connection is used only for client to server messaging. > > > > 2. He also said that it would also waste network bandwidth. > > You have a two-way communication no matter what you do. But if you > don't actually use inbound direction, all it gets used for is the > receipt of ACK packets. > > That is, the inbound connection is used to make the data transfer reliable. > > If you don't have any payload data on the inbound connection, then the > outbound connection won't have any ACK packets. > > If you're sending payload data, the ACK info can "hitchhike" along with > the payload packets, thus saving bandwidth. But if you're not sending > any payload data at all, there will be packets transmitted which contain > the ACKs and nothing else. 
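A rough back-of-the-envelope sketch of the point about ACK-only packets. It assumes minimum-size IPv4 and TCP headers (20 bytes each, no options), ignores link-layer framing, and by default assumes one pure ACK per data segment (with delayed ACKs, one ACK per two segments is more typical). The function names are illustrative only.

```python
IPV4_HEADER = 20  # bytes, minimum IPv4 header without options
TCP_HEADER = 20   # bytes, minimum TCP header without options
PURE_ACK = IPV4_HEADER + TCP_HEADER  # an ACK-only packet carries no payload


def data_packet_size(payload):
    """Size in bytes of one data packet carrying `payload` bytes."""
    return PURE_ACK + payload


def reverse_traffic_ratio(payload, segments_per_ack=1):
    """Bytes of ACK-only traffic flowing back per byte of data-packet traffic."""
    return PURE_ACK / (segments_per_ack * data_packet_size(payload))


if __name__ == "__main__":
    for payload in (1, 100, 1460):
        print(payload, round(reverse_traffic_ratio(payload), 3))
```

Under these assumptions, a stream of 1-byte payloads generates almost as many bytes of pure ACKs in the reverse direction as it sends forward, whereas with 1460-byte segments the ACK stream is under 3% of the data stream. When both directions carry payload on one connection, most of those ACKs can ride along with data instead.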
> > The extra network overhead will be modest if you're sending a lot of > data all at once, say transferring a large file. But if very little > data is sent per packet, say individual characters in a telnet > connection, the overhead would be very high. So far until this, this was what I had though and learned about TCP connections. > If you have a single connection with payload data in both directions, > then the ACKs will almost always ride along with some payload data. The > only time a packet will contain nothing but an ACK will be when some > data was transmitted, but none is to be received at the time. > > Mike However, I had forgotten this case. This can explain he even said that using 2 TCP connection would cut the bandwidth into half. This sounds like the case he was referring to. Thanks, Hiro From ota at j.email.ne.jp Tue Mar 24 21:59:19 2009 From: ota at j.email.ne.jp (Yoshihiro Ota) Date: Tue Mar 24 21:59:26 2009 Subject: 2 uni-directional TCP connection good In-Reply-To: References: <20090320045319.04484fc5.ota@j.email.ne.jp> <20090322235253.432874dd.ota@j.email.ne.jp> Message-ID: <20090325005911.87aa63ab.ota@j.email.ne.jp> On Mon, 23 Mar 2009 08:20:20 +0000 (GMT) Robert Watson wrote: > > On Sun, 22 Mar 2009, Yoshihiro Ota wrote: > > >> On Fri, 20 Mar 2009, Yoshihiro Ota wrote: > >> > >>> 1. With TCP connections, only sender side can detect some communication > >>> issues passively if happened. By using two connections, you lost that > >>> ability by your self. I agree on this one. > >> > >> Could you expand a bit on this point? While the connection creation > >> process (usually) asymmetric, once the connection is built it's essentially > >> the same state machine on both sides of the connection, and socket > >> semantics with respect to the state machine are effectively identical. > >> Application on both sides should be able to detect disconnect, monitor > >> connection state using TCP_INFO, etc. 
> >
> > What I meant was that there were cases when a receiver could not tell
> > whether no data was coming or communication was interrupted. Once
> > connection is established, a route is available between a server and a
> > client. Let's say this route is broken for some reason, e.g. someone
> > unplugged a cable or a firewall started dropping or rejecting between these
> > server and client, a sender may not notice as soon as it happens but at
> > least, a sender knows a message was not delivered right. On the other
> > hand, receiver side does not have any idea that a message delivery failure
> > has happened at all or for a while unless using heartbeat messages in upper
> > layer. KEEP_ALIVE option seems to be implementation dependent such that you
> > cannot assure TCP connection availability for every minute.
>
> This is generally considered a robustness property rather than a fragility
> issue, but yes: if you need a liveliness property for idle connections with
> TCP, it's something you have to implement at the application layer, and many
> protocols indeed do this. I don't see that this is an argument for using two
> TCP connections as opposed to one, however. If you're interested in
> alternative protocols, however, SCTP allows a number of these protocol
> behaviors to be modified, and includes support for a heartbeat.

Actually, the programs I had problems with were ggated/ggatec. Looking back at my records, I found a problem with ggate when I had a slow ggated server (Celeron 400MHz) and a fast ggatec client (AMD Turion(tm) 64 X2 1.9GHz). While the client was writing a large file to the server, there were often major delays in its communication. While this was happening, I watched with "systat -vm" and saw that the NIC was taking only about 1 interrupt every couple of seconds.

Indeed, I tested SCTP with ggate but, as far as I remember, SCTP seemed to have problems such that ggate couldn't establish a connection back in 7.0-RELEASE.
I tested the same code on 7.1-RELEASE just now and it appeared to work.

That was only an experiment about a year ago; I couldn't easily find the cause, so I gave up. Recently, ggate became a necessity. I enhanced some of its debugging capabilities and found that when this happens, the sequence numbers in ggate are out of sync. For example, while this is happening, ggatec reports it has finished send() with seq# 186, but at the exact same time ggated reports it has only recv()ed up to seq# 184. They get back in sync eventually, but it takes a long time and it happens frequently. Meanwhile, the terminal does not respond in a timely manner. This hiccup doesn't happen if the client is doing read access to ggated. FTP and NFS don't show a similar hiccup and communicate at the link's maximum speed, 100BASE-T.

Both ggatec and ggated create 2 TCP connections and 2 threads for blocking reads and writes. When I mentioned this problem to the friend mentioned above, he suggested changing the 2 uni-directional TCP connections to 1 bi-directional one. In fact, this fixed the problem. At the same time, he also said that using 2 uni-directional TCP connections was so bad that no one should do it, as if it ought to be prohibited outright. So I wondered what other people would say about it.

If using 2 uni-directional TCP connections is not such a bad idea, or something to be prohibited, then we may have some issues in the TCP layer. If anyone is interested in looking into this, I can elaborate on other observations as well.

Thanks,
Hiro

From rizzo at iet.unipi.it Wed Mar 25 01:58:12 2009
From: rizzo at iet.unipi.it (Luigi Rizzo)
Date: Wed Mar 25 01:58:26 2009
Subject: does Copyright on source files expire ?
Message-ID: <20090325084722.GC98685@onelab2.iet.unipi.it>

Someone just asked me permission to move to a 3-clause BSD
copyright some piece of software that I haven't touched in 10+ years.
I said yes, but then I was wondering what happens if the person listed is not responding or not reachable anymore: does copyright on source code expire, and if so, when ? (I suppose it is related to either the date listed on the copyright, or to the date of some remarkable event for the author). cheers luigi From das at FreeBSD.ORG Wed Mar 25 02:29:24 2009 From: das at FreeBSD.ORG (David Schultz) Date: Wed Mar 25 02:29:32 2009 Subject: does Copyright on source files expire ? In-Reply-To: <20090325084722.GC98685@onelab2.iet.unipi.it> References: <20090325084722.GC98685@onelab2.iet.unipi.it> Message-ID: <20090325093152.GB85469@zim.MIT.EDU> On Wed, Mar 25, 2009, Luigi Rizzo wrote: > Someone just asked me permission to move to a 3-clause BSD > copyright some piece of software that I haven't touched in 10+ years. > > I said yes, but then I was wondering what happens if the > person listed is not responding or not reachable anymore: > does copyright on source code expire, and if so, when ? > (I suppose it is related to either the date listed on the copyright, > or to the date of some remarkable event for the author). In the US, the rule that applies most of the time is that Copyright expires 70 years after the author dies, although there are many special cases where the term differs. A person's Copyright doesn't go away just because they die, disappear, or fail to respond. If you can't contact them, their heirs, or whomever they transferred the Copyright to, you're stuck. From ertr1013 at student.uu.se Wed Mar 25 02:29:59 2009 From: ertr1013 at student.uu.se (Erik Trulsson) Date: Wed Mar 25 02:30:05 2009 Subject: does Copyright on source files expire ? 
In-Reply-To: <20090325084722.GC98685@onelab2.iet.unipi.it> References: <20090325084722.GC98685@onelab2.iet.unipi.it> Message-ID: <20090325091442.GA13455@owl.midgard.homeip.net> On Wed, Mar 25, 2009 at 09:47:22AM +0100, Luigi Rizzo wrote: > Someone just asked me permission to move to a 3-clause BSD > copyright some piece of software that I haven't touched in 10+ years. > > I said yes, but then I was wondering what happens if the > person listed is not responding or not reachable anymore: > does copyright on source code expire, and if so, when ? Yes, it will expire eventually, after a long time. The exact length of copyright can vary a bit between different countries, but in most places nowadays I think it is 'life of author plus 70 years' > (I suppose it is related to either the date listed on the copyright, > or to the date of some remarkable event for the author). > -- Erik Trulsson ertr1013@student.uu.se From rizzo at iet.unipi.it Wed Mar 25 02:36:05 2009 From: rizzo at iet.unipi.it (Luigi Rizzo) Date: Wed Mar 25 02:36:11 2009 Subject: does Copyright on source files expire ? In-Reply-To: <20090325093152.GB85469@zim.MIT.EDU> References: <20090325084722.GC98685@onelab2.iet.unipi.it> <20090325093152.GB85469@zim.MIT.EDU> Message-ID: <20090325094100.GA915@onelab2.iet.unipi.it> On Wed, Mar 25, 2009 at 05:31:52AM -0400, David Schultz wrote: > On Wed, Mar 25, 2009, Luigi Rizzo wrote: > > Someone just asked me permission to move to a 3-clause BSD > > copyright some piece of software that I haven't touched in 10+ years. > > > > I said yes, but then I was wondering what happens if the > > person listed is not responding or not reachable anymore: > > does copyright on source code expire, and if so, when ? > > (I suppose it is related to either the date listed on the copyright, > > or to the date of some remarkable event for the author). 
> > In the US, the rule that applies most of the time is that > Copyright expires 70 years after the author dies, although there > are many special cases where the term differs. > > A person's Copyright doesn't go away just because they die, > disappear, or fail to respond. If you can't contact them, their > heirs, or whomever they transferred the Copyright to, you're stuck. so it's worse than a patent :) cheers luigi From das at FreeBSD.ORG Wed Mar 25 03:01:14 2009 From: das at FreeBSD.ORG (David Schultz) Date: Wed Mar 25 03:01:23 2009 Subject: does Copyright on source files expire ? In-Reply-To: <20090325094100.GA915@onelab2.iet.unipi.it> References: <20090325084722.GC98685@onelab2.iet.unipi.it> <20090325093152.GB85469@zim.MIT.EDU> <20090325094100.GA915@onelab2.iet.unipi.it> Message-ID: <20090325100342.GA37547@zim.MIT.EDU> On Wed, Mar 25, 2009, Luigi Rizzo wrote: > On Wed, Mar 25, 2009 at 05:31:52AM -0400, David Schultz wrote: > > On Wed, Mar 25, 2009, Luigi Rizzo wrote: > > > Someone just asked me permission to move to a 3-clause BSD > > > copyright some piece of software that I haven't touched in 10+ years. > > > > > > I said yes, but then I was wondering what happens if the > > > person listed is not responding or not reachable anymore: > > > does copyright on source code expire, and if so, when ? > > > (I suppose it is related to either the date listed on the copyright, > > > or to the date of some remarkable event for the author). > > > > In the US, the rule that applies most of the time is that > > Copyright expires 70 years after the author dies, although there > > are many special cases where the term differs. > > > > A person's Copyright doesn't go away just because they die, > > disappear, or fail to respond. If you can't contact them, their > > heirs, or whomever they transferred the Copyright to, you're stuck. > > so it's worse than a patent :) In that sense, yes. But at least with Copyright you have the option of rewriting the code from scratch. 
In the U.S., there have been various attempts to improve the laws regarding orphaned works, but most of the good ideas run afoul of international Copyright treaties. From danny at cs.huji.ac.il Wed Mar 25 04:36:52 2009 From: danny at cs.huji.ac.il (Danny Braniss) Date: Wed Mar 25 04:36:59 2009 Subject: Intel Integrated Raid (iir) relevance Message-ID: It's no longer working (for me) under 7.2, and so far I am not getting any feedback, so since it seems that this particular hardware has reached EOL, I was wondering if, a) it's true, b) drop it, and replace it. c) should time be spent in getting it to work again. danny From johans at stack.nl Wed Mar 25 02:34:22 2009 From: johans at stack.nl (Johan van Selst) Date: Wed Mar 25 04:50:59 2009 Subject: does Copyright on source files expire ? In-Reply-To: <20090325084722.GC98685@onelab2.iet.unipi.it> References: <20090325084722.GC98685@onelab2.iet.unipi.it> Message-ID: <20090325093416.GA66389@mud.stack.nl> Luigi Rizzo wrote: > I said yes, but then I was wondering what happens if the > person listed is not responding or not reachable anymore: > does copyright on source code expire, and if so, when ? Yes, copyright expires. When it expires exactly depends on local legislation. Generally this is a number of years after the death of the longest living author. I believe in the US copyright generally expires 70 years after the death - or if the copyright is owned by a company, then it expires 120 years after creation (or 95 years after publication). There is a lot of documentation about copyright details available on the internet - but keep in mind that laws change frequently. So it's probably safe to assume that all code used in FreeBSD is still covered by copyright and all licenses granted by the authors still apply (untill one day the copyright expires). P.S. As with all legal stuff, there are always exceptions to the rule: specific cases and local legislation may differ a lot. 
Ciao, Johan -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 163 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090325/c4e249af/attachment.pgp From samflanker at gmail.com Wed Mar 25 06:31:03 2009 From: samflanker at gmail.com (Vladimir Ermakov) Date: Wed Mar 25 06:31:10 2009 Subject: [problem] aac0 does not respond In-Reply-To: <20090324141136.GA46558@jem.dhs.org> References: <49C8AD9B.7000500@gmail.com> <3bbf2fe10903240324t6616cc9dx6ae28028ac971be6@mail.gmail.com> <49C8B5E3.2000104@gmail.com> <49C8D6AD.7050501@gmail.com> <20090324141136.GA46558@jem.dhs.org> Message-ID: <49CA321E.5070305@gmail.com> Ed Maste wrote: > 2009/3/24 Vladimir Ermakov : > >> Hello, All >> >> Describe my problem: >> have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 >> 2 HHD of 6 have errors in smart data (damaged) >> i am try read file /var/db/mysql/ibdata1 from this volume >> system does not respond ( lost access to ssh ) after read 6GB data >> > >from this > >> file >> and print debug messages on ttyv0 >> > > If the messages you see are the same as in the message to which you > provided a link ("COMMAND xxx TIMED OUT AFTER xxx SECONDS") it typically > means that the RAID controller has crashed. My initial suggestion is to > check the firmware version installed on your card, and update to the > latest from Adaptec's website if you're not running that one already. > > Attilio also has some driver updates (ported from Adaptec's latest > vendor driver) that you can try. The plan is to commit them sometime > soon, but he can forward those on for testing before that happens. > > -Ed > > Hello I updated the firmware [Build 16116] to [Build 16501]. Update does not fix problem. Where to get a some driver updates? 
/Vladimir Ermakov From yuri at rawbw.com Wed Mar 25 12:27:40 2009 From: yuri at rawbw.com (Yuri) Date: Wed Mar 25 12:27:46 2009 Subject: Atheros wireless card keeps losing signal when signal is too weak Message-ID: <49CA7D47.7070406@rawbw.com> I have Linux box sitting next to FreeBSD box that has a very cheap Airlink 101 card but it has no problems connecting to my WiFi network. Every time when Linux box says that quality of connection drops below 10/100 FreeBSD box shows "status: no carrier". Linux connections still function ok. I even bought a large WiFi antenna for FreeBSD box but still have this problem. Is there some 'sensitivity' parameter that driver may be setting too low on the card? 'iwconfig' on Linux shows some 'sensitivity' parameter=200. 7.1-STABLE ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) ath0: mem 0xcffe0000-0xcffeffff irq 16 at device 5.0 on pci0 Yuri From onemda at gmail.com Wed Mar 25 14:13:48 2009 From: onemda at gmail.com (Paul B. Mahol) Date: Wed Mar 25 14:14:12 2009 Subject: Atheros wireless card keeps losing signal when signal is too weak In-Reply-To: <49CA7D47.7070406@rawbw.com> References: <49CA7D47.7070406@rawbw.com> Message-ID: <3a142e750903251350l66801af4j26722a5b905a9a34@mail.gmail.com> On 3/25/09, Yuri wrote: > I have Linux box sitting next to FreeBSD box that has a very cheap > Airlink 101 card but it has no problems connecting to my WiFi network. > > Every time when Linux box says that quality of connection drops below > 10/100 FreeBSD box shows "status: no carrier". > Linux connections still function ok. > > I even bought a large WiFi antenna for FreeBSD box but still have this > problem. > > Is there some 'sensitivity' parameter that driver may be setting too low > on the card? 
I'm only aware of roam:rssi & roam:rate -- Paul From sam at freebsd.org Wed Mar 25 14:30:38 2009 From: sam at freebsd.org (Sam Leffler) Date: Wed Mar 25 14:30:45 2009 Subject: Atheros wireless card keeps losing signal when signal is too weak In-Reply-To: <3a142e750903251350l66801af4j26722a5b905a9a34@mail.gmail.com> References: <49CA7D47.7070406@rawbw.com> <3a142e750903251350l66801af4j26722a5b905a9a34@mail.gmail.com> Message-ID: <49CAA27A.6060602@freebsd.org> Paul B. Mahol wrote: > On 3/25/09, Yuri wrote: > >> I have Linux box sitting next to FreeBSD box that has a very cheap >> Airlink 101 card but it has no problems connecting to my WiFi network. >> >> Every time when Linux box says that quality of connection drops below >> 10/100 FreeBSD box shows "status: no carrier". >> Linux connections still function ok. >> >> I even bought a large WiFi antenna for FreeBSD box but still have this >> problem. >> >> Is there some 'sensitivity' parameter that driver may be setting too low >> on the card? >> > > I'm only aware of roam:rssi & roam:rate > > Those parameters control the roaming algorithm. The OP didn't identify their card, freebsd version, or provide any info about their setup or why ifconfig reports "no carrier". It just sounds like there's a loss in the signal and freebsd gets a beacon miss and tries to reconnect while linux does not. Once the rssi drops to "10" (presumably 5dBm) minor variations in the environment can become significant (e.g. orientation of a laptop, obstructions, antenna quality) and it's impossible to comment on what's happening w/o detailed information such as provided by athstats. FWIW cardbus cards that follow the reference design closely typically work pretty well and don't benefit from an external antenna. Vendors of cheap designs often scrimp when it comes to the antenna. When wireless is inside a case (e.g. 
a PCI card) then it's worth remoting the antenna but you need to be careful about routing the pigtail(s) and I can't count the number of times I've tracked problems down to faulty cables and/or connections.

Sam

From psteele at maxiscale.com Wed Mar 25 18:39:23 2009
From: psteele at maxiscale.com (Peter Steele)
Date: Wed Mar 25 18:39:30 2009
Subject: WARNING: Expected rawoffset 0, found 63?
In-Reply-To: <21432774.241238031049378.JavaMail.HALO$@halo>
Message-ID: <3084677.261238031500941.JavaMail.HALO$@halo>

I posted this on the questions list but didn't get a lot of traction. I've created GEOM mirrored file systems on two slices of my system's drives and everything seems to be working, but I get the warnings

WARNING: Expected rawoffset 0, found 63
WARNING: Expected rawoffset 0, found 50332464

when the mirrors are being created. These correspond to the offsets for these slices in the partition table:

# fdisk -p ad4
# /dev/ad4
g c484521 h16 s63
p 1 0xa5 63 50332401
a 1
p 2 0xa5 50332464 16778160
p 3 0xa5 67110624 421285536

Partition three is not mirrored, just partitions 1 and 2. I use the following command to create the slice 1 mirror:

gmirror label -v -n -b round-robin s1

and a similar one for slice 2. Additional drives are added to this mirror after the data has been copied to the mirrored file systems. The disks are set up with the required labels, including making sure the c partition is reduced in size by one sector. E.g.:

# bsdlabel ad4s1
# /dev/ad4s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a: 10485760       16    4.2BSD     2048 16384 28528
  c: 50332400        0    unused        0     0         # "raw" part, don't edit
  d:  8388608 10485776    4.2BSD     2048 16384 28528
  e: 31457280 18874384    4.2BSD     2048 16384 28528
bsdlabel: partition c doesn't cover the whole unit!
bsdlabel: An incorrect partition c may cause problems for standard system utilities # bsdlabel ad4s2 # /dev/ad4s2: 8 partitions: # size offset fstype [fsize bsize bps/cpg] b: 16778143 16 swap c: 16778159 0 unused 0 0 # "raw" part, don't edit bsdlabel: partition c doesn't cover the whole unit! bsdlabel: An incorrect partition c may cause problems for standard system utilities So as far as I can tell I have everything configured the way it should be and everything appears to be working fine, but these warnings worry me. Should I be worried? From yuri at rawbw.com Wed Mar 25 21:33:08 2009 From: yuri at rawbw.com (Yuri) Date: Wed Mar 25 21:33:15 2009 Subject: Atheros wireless card keeps losing signal when signal is too weak In-Reply-To: <49CAA27A.6060602@freebsd.org> References: <49CA7D47.7070406@rawbw.com> <3a142e750903251350l66801af4j26722a5b905a9a34@mail.gmail.com> <49CAA27A.6060602@freebsd.org> Message-ID: <49CB057E.8080900@rawbw.com> Sam Leffler wrote: > Those parameters control the roaming algorithm. The OP didn't > identify their card, freebsd version, or provide any info about their > setup or why ifconfig reports "no carrier". It just sounds like > there's a loss in the signal and freebsd gets a beacon miss and tries > to reconnect while linux does not. Once the rssi drops to "10" > (presumably 5dBm) minor variations in the environment can become > significant (e.g. orientation of a laptop, obstructions, antenna > quality) and it's impossible to comment on what's happening w/o > detailed information such as provided by athstats. > > FWIW cardbus cards that follow the reference design closely typically > work pretty well and don't benefit from an external antenna. Vendors > of cheap designs often scrimp when it comes to the antenna. When > wireless is inside a case (e.g. 
a PCI card) then it's worth remoting > the antenna but you need to be careful about routing the pigtail(s) > and I can't count the number of times I've tracked problems down to > faulty cables and/or connections. > I did identify my FreeBSD version and card in my original post, but here they are again: 7.1-STABLE ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) ath0: mem 0xcffe0000-0xcffeffff irq 16 at device 5.0 on pci0 One way or another little cheap laptop card with ndis driver delivers more steady connection then atheros pci card connected to freebsd. Maybe like you mentioned Linux has higher tolerance to missing beacons. Does it make sense to have a parameter "lost beakon tolerance"? Yuri From bruce at cran.org.uk Thu Mar 26 02:59:12 2009 From: bruce at cran.org.uk (Bruce Cran) Date: Thu Mar 26 02:59:18 2009 Subject: Overflow in vm.vmtotal expected when allocating huge amounts of memory? Message-ID: <20090326095849.1ec57dda@gluon.draftnet> Are overflows in the vm counters expected when dealing with huge amounts of memory on 64-bit platforms? I wrote an application which malloc'd 10TB memory and then sat doing nothing; vm.vmtotal showed -2132654356K. Shouldn't unsigned integers be used for any vm stats to avoid overflows? -- Bruce Cran From onemda at gmail.com Thu Mar 26 03:12:43 2009 From: onemda at gmail.com (Paul B. Mahol) Date: Thu Mar 26 03:12:54 2009 Subject: Atheros wireless card keeps losing signal when signal is too weak In-Reply-To: <49CB057E.8080900@rawbw.com> References: <49CA7D47.7070406@rawbw.com> <3a142e750903251350l66801af4j26722a5b905a9a34@mail.gmail.com> <49CAA27A.6060602@freebsd.org> <49CB057E.8080900@rawbw.com> Message-ID: <3a142e750903260312k20e34aafn49b6445c9c955adf@mail.gmail.com> On 3/26/09, Yuri wrote: > Sam Leffler wrote: >> Those parameters control the roaming algorithm. The OP didn't >> identify their card, freebsd version, or provide any info about their >> setup or why ifconfig reports "no carrier". 
It just sounds like >> there's a loss in the signal and freebsd gets a beacon miss and tries >> to reconnect while linux does not. Once the rssi drops to "10" >> (presumably 5dBm) minor variations in the environment can become >> significant (e.g. orientation of a laptop, obstructions, antenna >> quality) and it's impossible to comment on what's happening w/o >> detailed information such as provided by athstats. >> >> FWIW cardbus cards that follow the reference design closely typically >> work pretty well and don't benefit from an external antenna. Vendors >> of cheap designs often scrimp when it comes to the antenna. When >> wireless is inside a case (e.g. a PCI card) then it's worth remoting >> the antenna but you need to be careful about routing the pigtail(s) >> and I can't count the number of times I've tracked problems down to >> faulty cables and/or connections. >> > > I did identify my FreeBSD version and card in my original post, but here > they are again: > 7.1-STABLE > ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) > ath0: mem 0xcffe0000-0xcffeffff irq 16 at device 5.0 on pci0 > > One way or another little cheap laptop card with ndis driver delivers > more steady connection then atheros pci card connected to freebsd. > Maybe like you mentioned Linux has higher tolerance to missing beacons. > Does it make sense to have a parameter "lost beakon tolerance"? Perhaps this is what are you looking for: bmissthreshold count Set the number of consecutive missed beacons at which the station will attempt to roam (i.e., search for a new access point). The count parameter must be in the range 1 to 255; though the upper bound may be reduced according to device capabilities. The default threshold is 7 consecutive missed beacons; but this may be overridden by the device driver. Another name for the bmissthreshold parameter is bmiss. 
-- Paul From peterjeremy at optushome.com.au Thu Mar 26 03:17:42 2009 From: peterjeremy at optushome.com.au (Peter Jeremy) Date: Thu Mar 26 03:17:53 2009 Subject: does Copyright on source files expire ? In-Reply-To: <20090325093152.GB85469@zim.MIT.EDU> References: <20090325084722.GC98685@onelab2.iet.unipi.it> <20090325093152.GB85469@zim.MIT.EDU> Message-ID: <20090326095802.GH56137@server.vk2pj.dyndns.org> On 2009-Mar-25 05:31:52 -0400, David Schultz wrote: >In the US, the rule that applies most of the time is that >Copyright expires 70 years after the author dies, although there >are many special cases where the term differs. And the '70' gets regularly extended following pressure from the big content owners. As a rule of thumb, you can expect (eg) 'Mickey Mouse' to never be released from Copyright. -- Peter Jeremy -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 196 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090326/d7c975d0/attachment.pgp From ttw+bsd at cobbled.net Thu Mar 26 04:48:33 2009 From: ttw+bsd at cobbled.net (ttw+bsd@cobbled.net) Date: Thu Mar 26 04:48:39 2009 Subject: does Copyright on source files expire ? In-Reply-To: <20090325093152.GB85469@zim.MIT.EDU> References: <20090325084722.GC98685@onelab2.iet.unipi.it> <20090325093152.GB85469@zim.MIT.EDU> Message-ID: <20090326114828.GA2840@holyman.cobbled.net> On 25.03-05:31, David Schultz wrote: [ ... ] > A person's Copyright doesn't go away just because they die, > disappear, or fail to respond. If you can't contact them, their > heirs, or whomever they transferred the Copyright to, you're stuck. yeah but it's a little like finding something. if there not about and not reachable there isn't much they can do to stop you using it. if they popup and make demands later then you get to choose between re-writes and haggling (twenty shekels is standard). 
point is you "can" use it, the actual copyright owner needs to sue you; not like saying "jehovah" which may result in action by the agents of the state. n.b: using the above opinion may get you crucified. From prashant.vaibhav at gmail.com Thu Mar 26 06:22:15 2009 From: prashant.vaibhav at gmail.com (Prashant Vaibhav) Date: Thu Mar 26 06:22:22 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) Message-ID: <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com> Hi everyone, I'm a potential Google Summer of Code applicant, proposing to work on improving the timecounter performance in the FreeBSD kernel (suggestion from Timecounter Performance Improvements). My qualifications are mentioned at the end of this email, for those interested. After some initial discussion in #freebsd-soc, I'm posting this to the mailing lists (and CC'ing it to specific people) for further discussion before I finalize and submit my application. The primary idea is to improve the performance and resolution of gettimeofday() and friends by creating a efficient userspace implementation of these functions, along with some supporting modifications to the kernel. According to my understanding, currently the gettimeofday() function calls into the kernel to retrieve the timing information to pass on to user apps. I propose to improve it as follows: Export the relevant timing information to a shared page in memory, which will be mapped into every user app's address space. The gettimeofday() function's implementation will then be changed to read the timestamp counter (TSC) from the processor, and use the reading in conjunction with the timing info exported by the kernel to calculate and return the time info in proper format. The TSC can be read very efficiently from userspace (currently this is the fastest and highest resolution timer available, beating HPET, PIT, RTC etc.). 
This will allow applications to have a very fast and more importantly, a higher resolution timer available to them. This will also pave way for optionally making the FreeBSD kernel tickless, which would help with efficiency and power consumption (the processor will be able to sleep for longer durations without having to service timer interrupts several hundred times a second). Other operating systems (like OS X) already do this to varying extent.

There are several issues with this approach however, and I plan to tackle each of them so that there is no loss of functionality or accuracy, and certainly no loss of performance. The project will be completed in stages, tackling each of these issues:

- Implement the exporting of shared system-wide pages to be mapped into each process. (There has been some work done in this area: Avoiding syscall overhead). This page will contain timing info.
- Have the kernel read and export the information related to TSC during boot-up. This is heavily processor dependent and each processor (those from Intel/AMD) has its own peculiarities. The kernel should provide at least the TSC frequency by which the TSC read from userspace can be scaled to get nanosecond time. Wall time offset at boot-time should also be exported so TSC can be converted to wall time.
- The TSC frequency might change on certain processors with non-constant TSC rate (because of SpeedStep, dynamic freq scaling etc.). The only way to combat this is that the kernel be notified every time the processor frequency changes. Every cpu frequency driver will need to be updated to notify the kernel before and after a cpu freq change. The tsc frequency will then need to be adjusted in the exported info. This does not apply to modern processors (Intel Core or higher and recent AMD processors, both of which have a constant tsc rate).
- On multiprocessor systems, threads might bounce between different processors. There are two problems here: The TSC of each core could have an offset relative to each other, and the TSC of each core could have a drifting frequency. The first issue is found on most multicore CPUs, and will be solved by measuring the offset at boot-time and exporting this info so that the tsc read by the user app can be corrected based on the core it's running on. The second issue only applies to AMD Athlon X2 during C1 state. This is solved by following AMD's recommendation: disable c1 clock ramping during bootup and suspend/resume by updating relevant info in the northbridge configuration.
- In case we have some time left before completion of GSoC, one more thing can be added. Scaling the processor frequency up and down takes a finite amount of time (tens to hundreds of microseconds). During this time, the tsc frequency is undefined. Since we will be notified both before and after such a change (by the cpufreq drivers), an alternate source (like HPET or RTC) can be used to measure this duration and correct the tsc offset after the switch.

Given all this is handled carefully, we will be able to use the TSC read-out as either: (1) an offset from the last-updated timestamp (updated HZ times every second, on each timer interrupt). Or (2) use the TSC exclusively for timing and disable the timer interrupt. Currently the first approach will be used. This will avoid having to call into the kernel to get the timing info, as well as provide finer resolution timing. The second approach is an extension to allow for a tickless kernel (not part of my proposal, but do-able in the future).

To summarize: The kernel exports a shared page mapped into each process and set as read-only. This page is updated on each clock tick to contain the time. This page also contains the tsc frequency and other information, which is potentially updated every time this info changes.
The userspace implementation of gettimeofday() reads the timestamp counter from the processor, and the scale, offset etc. from the shared page to convert it to nanoseconds. This offset is then added to the last updated nano time (also present in the shared page) and returned to the application. The various peculiarities of each processor's tsc implementation will be accounted for.

We will also need to make comprehensive benchmarks and tests to assert the validity and performance benefits. I am not well versed with rigorous benchmarking so this part of the project would need additional thought.

My qualifications / personal details: I'm a 22 year old Indian male. I'm an undergrad in Electrical Engineering & Computer Science at Jacobs University Bremen, Germany. I have years of experience in C/C++ and varying job experiences ranging from web development to human-computer interaction devices. I've taken courses in computer architecture and operating systems. More details will be listed on my application, for now I'll mention the experience most relevant to the task at hand: Since August 2008, I've started and completed a port of the Darwin XNU kernel (used by OS X), for generic x86 PCs. (Webpage: http://code.google.com/p/xnu-dev) Among other things, I added lots of rtc/tsc improvements to Apple's implementation that deals with exactly the same problems I have described above. All issues were solved, and the kernel is being used in production on thousands of computers worldwide (including the computer I'm typing this on!). Most of the code was written by me, with support from a few other people, so I have a fair idea of the challenges and their solutions. The tsc multicore synchronization was written independently by two other people, so this is the part with which I'm least familiar. The code is already implemented for XNU and it works well: so most of the work would be porting it to BSD.
Since I'm the author of most of it, and have good contact with the other 3-4 people who contributed other parts, there should be no licensing issues. I've also written a SpeedStep driver for OS X (http://code.google.com/p/xnu-speedstep), which sends clock recalibration signals to the kernel (also made relevant modifications in the kernel for this to work).

What I still need to learn/plan: My experience with FreeBSD is somewhat limited. I have a dragonflyBSD based home server (because freebsd didn't have drivers for its cheap ethernet card). My kernel programming experience is also limited to the XNU kernel (since about July last year) and I've helped fix a minor bug (typo in ethernet driver PCI ID) in dfbsd kernel. But I'm a fast learner, and given the very well commented and clear code in the freebsd kernel, I should be up to speed pretty soon. Right now I've installed freebsd in a virtual machine and am playing around with it. Will shortly try building the kernel and maybe make small modifications, figure out exactly which parts of the kernel will need modifications. I've also been reading the freebsd handbook, the "arch" book and the dev handbook. Another big problem for me would be making the modifications to export the shared page and map it into each process; my experience is mostly in handling the tsc/rtc code, but not in memory management, so this is something I need to learn. Lastly, I'm not very well-versed in making rigorous benchmarks. I've done simple benchmarking during the xnu kernel development, but these were limited to measuring clock ticks. A more comprehensive test plan would include mysql benchmarks and similar.

Thanks everyone for reading through this humongous email! :-) Discussion commenceth.

Best, Prashant Vaibhav

PS: I am out of town with limited connectivity so responses could be somewhat slow. My aim however is to finalize and submit the application by the end of the month.
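[Editor's illustration] The conversion step summarized in the proposal above (scale the TSC delta since the last kernel update, then add it to the last exported nanosecond time) could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the struct layout, the field names and both function names are hypothetical and do not come from FreeBSD or XNU. The TSC value is passed in as a parameter so the arithmetic stays portable; a real implementation would read it with the rdtsc instruction and would also need some way to detect that the kernel updated the page in the middle of a read.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout of the read-only page exported by the kernel.
 * None of these names exist in FreeBSD; they are for illustration only.
 */
struct shared_timeinfo {
	uint64_t tsc_base;         /* TSC value captured at the last kernel update */
	uint64_t ns_base;          /* nanosecond time at that same instant */
	uint64_t ns_per_tick_fp32; /* nanoseconds per TSC tick, 32.32 fixed point */
};

/* Convert a TSC frequency in Hz into a 32.32 fixed-point ns-per-tick scale. */
uint64_t
tsc_scale_from_hz(uint64_t hz)
{
	assert(hz != 0);
	/* 1e9 << 32 is about 4.3e18, which still fits in 64 bits. */
	return ((UINT64_C(1000000000) << 32) / hz);
}

/*
 * Turn a raw TSC reading into nanoseconds using the exported data.
 * Assumption: the page is refreshed often enough (HZ times per second)
 * that delta * scale cannot overflow 64 bits.
 */
uint64_t
tsc_to_ns(const struct shared_timeinfo *ti, uint64_t tsc_now)
{
	uint64_t delta = tsc_now - ti->tsc_base;

	return (ti->ns_base + ((delta * ti->ns_per_tick_fp32) >> 32));
}
```

With a 2 GHz TSC the scale is exactly half a nanosecond per tick, so a reading 2,000,000 ticks past tsc_base maps to ns_base plus 1,000,000 ns. The fixed-point scale avoids a division on every call, which matters if the point of the exercise is to be cheaper than a syscall.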
From bsd.quest at googlemail.com Thu Mar 26 10:16:25 2009 From: bsd.quest at googlemail.com (Alexej Sokolov) Date: Thu Mar 26 10:16:32 2009 Subject: Intel Pro 82546GB COPPER. Frames reception by disabled interrupts Message-ID: <671bb5fc0903261016s74de6a56va2ae7f7127eef286@mail.gmail.com> Hello, I disable interrupts with: E1000_WRITE_REG(&adapter->hw, E1000_IMC, 0xffffffff); This clears the interrupt mask register. Question: will the network adapter still accept incoming frames and transfer them to host memory while interrupts are disabled? Thanks, Alexej From rwatson at FreeBSD.org Thu Mar 26 14:42:50 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Thu Mar 26 14:42:56 2009 Subject: does Copyright on source files expire ? In-Reply-To: <20090326114828.GA2840@holyman.cobbled.net> References: <20090325084722.GC98685@onelab2.iet.unipi.it> <20090325093152.GB85469@zim.MIT.EDU> <20090326114828.GA2840@holyman.cobbled.net> Message-ID: On Thu, 26 Mar 2009, ttw+bsd@cobbled.net wrote: > On 25.03-05:31, David Schultz wrote: > [ ... ] >> A person's Copyright doesn't go away just because they die, disappear, or >> fail to respond. If you can't contact them, their heirs, or whomever they >> transferred the Copyright to, you're stuck. > > yeah but it's a little like finding something. if there not about and not > reachable there isn't much they can do to stop you using it. if they popup > and make demands later then you get to choose between re-writes and haggling > (twenty shekels is standard). In some countries, such as the US, copyright violation can be a criminal, not just civil, matter. Also, in countries where copyright can be assigned, the holder listed in a file may not accurately represent who the current holder is, so while the original author may be unreachable, etc, the current holder may be alive and kicking.
Robert N M Watson Computer Laboratory University of Cambridge > > point is you "can" use it, the actual copyright owner needs to sue > you; not like saying "jehovah" which may result in action by the agents > of the state. > > n.b: using the above opinion may get you crucified. > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From mwm-keyword-freebsdhackers2.e313df at mired.org Thu Mar 26 15:24:52 2009 From: mwm-keyword-freebsdhackers2.e313df at mired.org (Mike Meyer) Date: Thu Mar 26 15:24:59 2009 Subject: does Copyright on source files expire ? In-Reply-To: <20090326095802.GH56137@server.vk2pj.dyndns.org> References: <20090325084722.GC98685@onelab2.iet.unipi.it> <20090325093152.GB85469@zim.MIT.EDU> <20090326095802.GH56137@server.vk2pj.dyndns.org> Message-ID: <20090326175642.4fafdde3@bhuda.mired.org> On Thu, 26 Mar 2009 20:58:02 +1100 Peter Jeremy wrote: > On 2009-Mar-25 05:31:52 -0400, David Schultz wrote: > >In the US, the rule that applies most of the time is that > >Copyright expires 70 years after the author dies, although there > >are many special cases where the term differs. > > And the '70' gets regularly extended following pressure from the big > content owners. As a rule of thumb, you can expect (eg) 'Mickey > Mouse' to never be released from Copyright. You forgot "and anything with a copyright newer than that one, which includes anything in the US written after March 1, 1989." after the words "Mickey Mouse". And yes, chasing down the owners is a PITA. IIRC, Paul Allen did a DVD retrospective of John Wayne's movies, and it cost more to chase down and obtain rights from all the people involved than it did to produce the DVD. But that's the way the big content owners want it. 
http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From phk at phk.freebsd.dk Thu Mar 26 15:54:32 2009 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Thu Mar 26 15:54:46 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: Your message of "Thu, 26 Mar 2009 18:21:54 +0530." <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com> Message-ID: <1170.1238103059@critter.freebsd.dk> In message <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com>, Prasha nt Vaibhav writes: >The gettimeofday() function's implementation will then be >changed to read the timestamp counter (TSC) from the processor, and use the >reading in conjunction with the timing info exported by the kernel to >calculate and return the time info in proper format. I take it as read, that you know that there are other relvant functions than gettimeofday() and that these must provide a monotonic timescale when queried interleaved ? Be aware that the TSC may not be, and may not stay synchronized across multiple cores. Further more, the TSC is not constant frequency and in particular not "known frequency" at all times. There are a lot of nasty cases to check, and a nasty interpolation required, which, in my tests some years back, totally negated any speedup from using the TSC in the first place. At the very minimum, you will have to add a quirk table where known good {CPU+MOBO+BIOS} combinations can be entered, as we find them. >This will also pave way for optionally making the >FreeBSD kernel tickless, Rubbish. Timecounters are not even closely associated with the tick or ticklessness of the kernel. [1] > - The TSC frequency might change on certain processors with non-constant > TSC rate (because of SpeedStep, dynamic freq scaling etc.). 
The only way to > combat this is that the kernel be notified every time the processor > frequency changes. Every cpu frequency driver will need to be updated to > notify the kernel before and after a cpu freq change. That is not good enough, the bios may autonomously change the cpu speed and the skew from not knowing exactly _when_ and _how_ the cpu clock changed, is a significant number of microseconds, plenty of time to make strange things happen. You will want to study carefully Dave Mills work to tame the alpha chips wandering SAW clocks. Poul-Henning [1] In my mind, reworking the callout system in the kernel would be a much better more neded and much more worthwhile project. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From guru at unixarea.de Fri Mar 27 01:42:35 2009 From: guru at unixarea.de (Matthias Apitz) Date: Fri Mar 27 01:42:42 2009 Subject: doing 'make installworld / installkernel' a second time? 
Message-ID: <20090327083023.GA2140@rebelion.Sisis.de> Hello, I've created a bootable USB key with -CURRENT like this: # mkdir -p /usr/src/CURRENT/obj # cd /usr/src/CURRENT # setenv CVSROOT :pserver:anoncvs@anoncvs.fr.FreeBSD.org:/home/ncvs # cvs login # cvs checkout src # cd /usr/src/CURRENT/src # setenv MAKEOBJDIRPREFIX /usr/src/CURRENT/obj # make buildworld # make buildkernel KERNCONF=GENERIC (USB key inserted as /dev/da0) # fdisk -I da0 # fdisk -B da0 # bsdlabel -w da0s1 auto # bsdlabel -B da0s1 # newfs /dev/da0s1a # mount /dev/da0s1a /mnt # make installworld DESTDIR=/mnt # make installkernel DESTDIR=/mnt KERNCONF=GENERIC INSTALL_NODEBUG=t # make distrib-dirs DESTDIR=/mnt # make distribution DESTDIR=/mnt # echo /dev/da0s1a / ufs rw 1 1 > /mnt/etc/fstab # cat < /mnt/etc/rc.conf wlans_ath0="wlan0" ifconfig_wlan0="WPA DHCP" hostname=tinyCurrent sshd_enable="YES" EOF-EOF-EOF # cp /etc/wpa_supplicant.conf /mnt/etc # umount /mnt the resulting USB key boots fine; what I'm unsure about is: can I copy /usr/src/CURRENT onto the key with # cp -rp /usr/src/CURRENT /mnt and when it is booted (in my EeePC) can I do there the installation to the SSD again with # newfs -m 0 -o space /dev/ad2s1a # mount /dev/ad2s1a /mnt # setenv MAKEOBJDIRPREFIX /CURRENT/obj # cd /CURRENT/src # make installworld DESTDIR=/mnt # make installkernel DESTDIR=/mnt KERNCONF=GENERIC INSTALL_NODEBUG=t # make distrib-dirs DESTDIR=/mnt # make distribution DESTDIR=/mnt or is /CURRENT/src and /CURRENT/obj not enough for the 2nd installation, for example because the 1st 'make installworld' has removed stuff below /usr/src/CURRENT/obj? 
Thx matthias -- Matthias Apitz Manager Technical Support - OCLC GmbH Gruenwalder Weg 28g - 82041 Oberhaching - Germany t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211 e - w http://www.oclc.org/ http://www.UnixArea.de/ From won.derick at yahoo.com Fri Mar 27 03:17:05 2009 From: won.derick at yahoo.com (Won De Erick) Date: Fri Mar 27 03:17:12 2009 Subject: Switching to SMM with FreeBSD 6.2 onwards Message-ID: <17314.10813.qm@web45811.mail.sp1.yahoo.com> Hi All, I'm not quite familiar with FreeBSD, but I want to do the following in 6.2/7.1. /* Raise IOPL to 3 to open all I/O ports */ /* something like 'i386_iopl(3)' */ ... /* Open SMRAM access */ outl(unsigned int port, unsigned long int data); Also, I appreciate comments on the following wrapper: static inline outl(unsigned int port, unsigned long int data) { asm("outl %0, %1" : : "a" (data), "dN" (port)); } My goal is to switch the processor to SMM by triggering SMI from userland. Thanks in advance, Won From takawata at init-main.com Fri Mar 27 03:24:01 2009 From: takawata at init-main.com (Takanori Watanabe) Date: Fri Mar 27 03:24:08 2009 Subject: Switching to SMM with FreeBSD 6.2 onwards In-Reply-To: Your message of "Fri, 27 Mar 2009 03:03:14 MST." <17314.10813.qm@web45811.mail.sp1.yahoo.com> Message-ID: <200903271021.n2RALixB062663@sana.init-main.com> In message <17314.10813.qm@web45811.mail.sp1.yahoo.com>, Won De Erick wrote: > >Hi All, > >I'm not quite familiar with FreeBSD, but I want to do the following in 6.2/7.1 >. > > /* Raise IOPL to 3 to open all I/O ports */ > /* something like 'i386_iopl(3)' */ > ... see i386_get_ioperm(2) or io(4). > /* Open SMRAM access */ > outl(unsigned int port, unsigned long int data); > > >Also, I appreciate comments on the following wrapper: > >static inline outl(unsigned int port, unsigned long int data) >{ > asm("outl %0, %1" : : "a" (data), "dN" (port)); >} > > >My goal is to switch the processor to SMM by triggering SMI from userland. Probably this will work. 
So what do you want ask about that? From ivoras at freebsd.org Fri Mar 27 03:36:20 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Fri Mar 27 03:36:27 2009 Subject: Switching to SMM with FreeBSD 6.2 onwards In-Reply-To: <200903271021.n2RALixB062663@sana.init-main.com> References: <17314.10813.qm@web45811.mail.sp1.yahoo.com> <200903271021.n2RALixB062663@sana.init-main.com> Message-ID: Takanori Watanabe wrote: > In message <17314.10813.qm@web45811.mail.sp1.yahoo.com>, Won De Erick wrote: >> Hi All, >> >> I'm not quite familiar with FreeBSD, but I want to do the following in 6.2/7.1 >> . >> >> /* Raise IOPL to 3 to open all I/O ports */ >> /* something like 'i386_iopl(3)' */ >> ... > > see i386_get_ioperm(2) or io(4). > >> /* Open SMRAM access */ >> outl(unsigned int port, unsigned long int data); >> >> >> Also, I appreciate comments on the following wrapper: >> >> static inline outl(unsigned int port, unsigned long int data) >> { >> asm("outl %0, %1" : : "a" (data), "dN" (port)); >> } >> >> >> My goal is to switch the processor to SMM by triggering SMI from userland. > > > Probably this will work. > So what do you want ask about that? One thing that comes to my mind is this: http://invisiblethingslab.com/resources/misc09/smm_cache_fun.pdf :)
From avg at icyb.net.ua Fri Mar 27 05:23:50 2009 From: avg at icyb.net.ua (Andriy Gapon) Date: Fri Mar 27 05:24:02 2009 Subject: Switching to SMM with FreeBSD 6.2 onwards In-Reply-To: References: <17314.10813.qm@web45811.mail.sp1.yahoo.com> <200903271021.n2RALixB062663@sana.init-main.com> Message-ID: <49CCC552.5070001@icyb.net.ua> on 27/03/2009 12:35 Ivan Voras said the following: > Takanori Watanabe wrote: >> In message <17314.10813.qm@web45811.mail.sp1.yahoo.com>, Won De Erick wrote: >>> Hi All, >>> >>> I'm not quite familiar with FreeBSD, but I want to do the following in 6.2/7.1 >>> . >>> >>> /* Raise IOPL to 3 to open all I/O ports */ >>> /* something like 'i386_iopl(3)' */ >>> ... >> see i386_get_ioperm(2) or io(4). >> >>> /* Open SMRAM access */ >>> outl(unsigned int port, unsigned long int data); >>> >>> >>> Also, I appreciate comments on the following wrapper: >>> >>> static inline outl(unsigned int port, unsigned long int data) >>> { >>> asm("outl %0, %1" : : "a" (data), "dN" (port)); >>> } >>> Take a look at machine/cpufunc.h >>> My goal is to switch the processor to SMM by triggering SMI from userland. >> >> Probably this will work. >> So what do you want ask about that?
> > One thing that comes to my mind is this: > http://invisiblethingslab.com/resources/misc09/smm_cache_fun.pdf > > :) Yeah, and IDA Pro rocks too :-) -- Andriy Gapon From prashant.vaibhav at gmail.com Fri Mar 27 05:56:00 2009 From: prashant.vaibhav at gmail.com (Prashant Vaibhav) Date: Fri Mar 27 05:56:07 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <1170.1238103059@critter.freebsd.dk> References: <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com> <1170.1238103059@critter.freebsd.dk> Message-ID: <17560ccf0903270555oe7d1652p7414a221aa2d6167@mail.gmail.com> Poul-Henning, Thanks for the feedback! >[...] these must provide a monotonic timescale when queried interleaved > ? Be aware that the TSC may not be, and may not stay synchronized across > multiple cores. The TSC is documented to be monotonically increasing across all x86 processors that implement it (that I'm aware of). I know that the TSC may not stay synchronized across multiple cores *in theory*. Practically, across most processors the only real issue has been an offset in the tsc of cores relative to each other (which can be measured and accounted for), or one core losing some ticks wrt the other during specific sleep states (this can be disabled and is recommended by AMD and Red Hat Linux). >Further more, the TSC is not constant frequency and in particular not > "known frequency" at all times. The TSC is guaranteed to be constant frequency on relatively modern processors from Intel and AMD ? whether the processor we are running on supports constant TSC rate can be queried via a CPUID instruction. The frequency can be measured at boot time by using another timing source such as the PIT, or read directly off the CPU for some models. >There are a lot of nasty cases to check, I have implemented many such 'nasty checks' over the past several months during my work with the xnu kernel ? I might have missed some, however. 
They are all done once during system boot (and during resume from sleep on some AMD dual cores). They're not very involved in my opinion. >and a nasty interpolation required, Could you please elaborate or hint me on some terms I can google about the interpolations that are required? Are you referring to the interpolation needed during measuring the tsc frequency to account for the (weird) duration of PIT? This happens during bootup only. >which, in my tests some years back, totally negated any speedup from using > the TSC in the first place. This could be an issue: I have not made extensive benchmarks. The benefit of using TSC could still be: the availability of a higher resolution timer which can be accessed from userspace. >At the very minimum, you will have to add a quirk table where known good > {CPU+MOBO+BIOS} combinations can be entered, as we find them. Perhaps. Or alternatively, a quirk table for known *bad* combinations. In my experience, most current x86 processors are OK (tested on Intel Pentium 4 and above, and AMD Athlons and above, with a variety of motherboard/BIOSes). >Rubbish. Timecounters are not even closely associated with the tick or > ticklessness of the kernel. My understanding could be flawed here, but the reasoning was: for a tickles kernel, we need some sort of monotonically increasing, known-rate counter as a replacement for periodic timer interrupts. Using the TSC (or HPET) would allow us to do so. Unless the alternative is to read the RTC at each call of gettimeofday() et al, which itself is not foolproof (eg. the user updates the hardware clock on a running system). I'm not aware of other high-resolution counters on the x86 platform which can serve this purpose. The PIT could be read, but it has too little range (16 bits iirc?) to be useful unless proper wraparound is done. The TSC is 64bits wide and guaranteed not to wrap around for 10 years or more (cf. Intel manuals). >the bios may autonomously change the cpu speed True. 
This could be an issue. For XNU and the SpeedStep driver we made, we combat this by disabling such BIOS-initiated frequency changes (refer: VoodooPower www.superhai.com/darwin.html ) >not knowing exactly _when_ and _how_ the cpu clock changed, is a > significant number of microseconds, plenty of time to make strange things > happen. Yes, for BIOS-initiated cpu frequency changes. For cpufreq driver-initiated changes, as I mentioned, the kernel can be notified before and after each change, the duration can be timed using an external timer and accounted for. >You will want to study carefully Dave Mills work to tame the alpha chips > wandering SAW clocks. Will do. I just started reading your paper on timecounters in FreeBSD which has been quite informative! Best, Prashant Vaibhav On Fri, Mar 27, 2009 at 3:00 AM, Poul-Henning Kamp wrote: > In message <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com>, > Prasha > nt Vaibhav writes: > > >The gettimeofday() function's implementation will then be > >changed to read the timestamp counter (TSC) from the processor, and use > the > >reading in conjunction with the timing info exported by the kernel to > >calculate and return the time info in proper format. > > I take it as read, that you know that there are other relvant > functions than gettimeofday() and that these must provide a > monotonic timescale when queried interleaved ? > > Be aware that the TSC may not be, and may not stay synchronized > across multiple cores. > > Further more, the TSC is not constant frequency and in particular > not "known frequency" at all times. > > There are a lot of nasty cases to check, and a nasty interpolation > required, which, in my tests some years back, totally negated any > speedup from using the TSC in the first place. > > At the very minimum, you will have to add a quirk table where > known good {CPU+MOBO+BIOS} combinations can be entered, as we > find them. 
> > >This will also pave way for optionally making the > >FreeBSD kernel tickless, > > Rubbish. Timecounters are not even closely associated with the > tick or ticklessness of the kernel. [1] > > > - The TSC frequency might change on certain processors with > non-constant > > TSC rate (because of SpeedStep, dynamic freq scaling etc.). The only > way to > > combat this is that the kernel be notified every time the processor > > frequency changes. Every cpu frequency driver will need to be updated > to > > notify the kernel before and after a cpu freq change. > > That is not good enough, the bios may autonomously change the cpu speed > and the skew from not knowing exactly _when_ and _how_ the cpu clock > changed, is a significant number of microseconds, plenty of time > to make strange things happen. > > You will want to study carefully Dave Mills work to tame the alpha > chips wandering SAW clocks. > > Poul-Henning > > [1] In my mind, reworking the callout system in the kernel would > be a much better more neded and much more worthwhile project. > > -- > Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 > phk@FreeBSD.ORG | TCP/IP since RFC 956 > FreeBSD committer | BSD since 4.3-tahoe > Never attribute to malice what can adequately be explained by incompetence. > From phk at phk.freebsd.dk Fri Mar 27 06:46:10 2009 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Fri Mar 27 06:46:24 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: Your message of "Fri, 27 Mar 2009 18:25:58 +0530." <17560ccf0903270555oe7d1652p7414a221aa2d6167@mail.gmail.com> Message-ID: <4955.1238161567@critter.freebsd.dk> In message <17560ccf0903270555oe7d1652p7414a221aa2d6167@mail.gmail.com>, Prashant Vaibhav writes: >>[...] these must provide a monotonic timescale when queried interleaved >> ? Be aware that the TSC may not be, and may not stay synchronized across >> multiple cores. > >The TSC is documented to be monotonically increasing [...] 
Notice the absence of the word "regular" ? That it is "monotonically increasing" just means that it does not count backwards (except on the buggy cpu-revs where it does). It does not mean that it counts upwards at a stable or constant rate. >>Further more, the TSC is not constant frequency and in particular not >> "known frequency" at all times. > >The TSC is guaranteed to be constant frequency on relatively modern >processors from Intel and AMD [...] Which is why you will neeed a {CPU+MOBO+BIOS} table of known good combinations: the majority of systems out there does not guarantee and some of those that do lie. Or have bugs. Or both. >>There are a lot of nasty cases to check, > >They're not very involved in my opinion. Then you likely have not done enough :-) >>and a nasty interpolation required, > >Could you please elaborate or hint me on some terms I can google about the >interpolations that are required? Are you referring to the interpolation >needed during measuring the tsc frequency to account for the (weird) >duration of PIT? This happens during bootup only. I'm talking about the systems where SMM bios operations cause the different CPU's TSC to develop skew over time. >>which, in my tests some years back, totally negated any speedup from using >> the TSC in the first place. > >This could be an issue: I have not made extensive benchmarks. The benefit of >using TSC could still be: the availability of a higher resolution timer >which can be accessed from userspace. We have the same resolution today, if you dare to enable TSC in the kernel. In fact, we have even better resolution, because the "struct bintime" format is much more precise than both timespec and timeval. So far I doubt anybody but me have tried to measure that this makes a difference :-) >>At the very minimum, you will have to add a quirk table where known good >> {CPU+MOBO+BIOS} combinations can be entered, as we find them. > >Perhaps. 
>Or alternatively, a quirk table for known *bad* combinations. No, FreeBSD is shipped "working by default", not "possibly working" by default and particularly not in an area, where the signs of trouble are so subtle that most people don't recognize them at all and just blame it on "random buggy crap". >>Rubbish. Timecounters are not even closely associated with the tick or > >My understanding could be flawed here, but the reasoning was: for a tickles >kernel, we need some sort of monotonically increasing, known-rate counter as >a replacement for periodic timer interrupts. We already have that in FreeBSD for CPU time accounting. The crucial fact about a tickless kernel, is that it does not take an interrupt N times a second just to see if there is anything to do in the callout queue, but instead uses the hardware timer to aim an interrupt at the next time it needs to wake up. >>the bios may autonomously change the cpu speed > >True. This could be an issue. Your optimism is cute but misguided. On most laptops the bios WILL change the CPU speed without notice in reaction to temperature and battery power. Let me repeat: >> [1] In my mind, reworking the callout system in the kernel would >> be a much better more neded and much more worthwhile project. Poul-Henning -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
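A minimal sketch of the tick versus tickless distinction drawn in the message above, assuming a hypothetical callout queue that stores absolute expiry times in nanoseconds. The names `callout_entry`, `next_deadline`, `oneshot_delay_ns` and `periodic_wakeups` are illustrative only and do not come from the FreeBSD sources. A periodic kernel takes `hz` interrupts per second whether or not anything is due; a tickless kernel would instead program a one-shot hardware timer for the earliest pending expiry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending callout: absolute expiry time in nanoseconds. */
struct callout_entry {
    uint64_t expires_ns;
};

/* Sentinel meaning "nothing pending, leave the timer disarmed". */
#define NO_DEADLINE UINT64_MAX

/* Earliest expiry in the queue, or NO_DEADLINE if the queue is empty. */
static uint64_t
next_deadline(const struct callout_entry *q, size_t n)
{
    uint64_t best = NO_DEADLINE;

    for (size_t i = 0; i < n; i++)
        if (q[i].expires_ns < best)
            best = q[i].expires_ns;
    return best;
}

/*
 * Tickless model: the delay to program into a one-shot timer so that
 * the next interrupt arrives exactly when the earliest callout is due.
 * Returns 0 if that callout is already overdue.
 */
static uint64_t
oneshot_delay_ns(const struct callout_entry *q, size_t n, uint64_t now_ns)
{
    uint64_t d = next_deadline(q, n);

    if (d == NO_DEADLINE)
        return NO_DEADLINE;
    return d > now_ns ? d - now_ns : 0;
}

/*
 * Periodic model: interrupts taken in [now, until) at hz per second,
 * independent of how much work is actually queued.
 */
static uint64_t
periodic_wakeups(uint64_t now_ns, uint64_t until_ns, uint64_t hz)
{
    assert(hz > 0 && until_ns >= now_ns);
    return (until_ns - now_ns) * hz / 1000000000ULL;
}
```

With three callouts pending over one second, the periodic model at hz=1000 still takes 1000 interrupts, while the one-shot model takes one per deadline. This sketch says nothing about which hardware timer is used or how accurate it is.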
From won.derick at yahoo.com Fri Mar 27 07:00:14 2009 From: won.derick at yahoo.com (Won De Erick) Date: Fri Mar 27 07:00:21 2009 Subject: Switching to SMM with FreeBSD 6.2 onwards Message-ID: <492862.81876.qm@web45808.mail.sp1.yahoo.com> --- On Fri, 3/27/09, Andriy Gapon wrote: > on 27/03/2009 12:35 Ivan Voras said > the following: > > Takanori Watanabe wrote: > >> In message <17314.10813.qm@web45811.mail.sp1.yahoo.com>, > Won De Erick wrote: > >>> Hi All, > >>> > >>> I'm not quite familiar with FreeBSD, but I > >>> want to do the following in 6.2/7.1 > >>> . > >>> > >>> /* Raise IOPL to 3 to open all I/O ports > >>> */ > >>> /* something like 'i386_iopl(3)' */ > >>> ... > >> see i386_get_ioperm(2) or io(4). > >> > >>> /* Open SMRAM access */ > >>> outl(unsigned int port, unsigned long > >>> int data); > >>> > >>> > >>> Also, I appreciate comments on the following > >>> wrapper: > >>> > >>> static inline outl(unsigned int port, unsigned > >>> long int data) > >>> { > >>> asm("outl %0, %1" : : "a" (data), "dN" > >>> (port)); > >>> } > >>> > > Take a look at machine/cpufunc.h Oh I see. :) > > >>> My goal is to switch the processor to SMM by > >>> triggering SMI from userland. > >> > >> Probably this will work. > >> So what do you want ask about that? If it is possible, I would like to write data to certain registers or to a portion of memory where the BIOS firmware or the BMC firmware could detect it as a 'reconfiguration' and record it in the SEL as "System Reconfigured". If someone has a better idea, it is very much welcome.
> > > > One thing that comes to my mind is this: > > http://invisiblethingslab.com/resources/misc09/smm_cache_fun.pdf I will add that to the ff: http://www.ssi.gouv.fr/fr/sciences/fichiers/lti/cansecwest2006-duflot-paper.pdf I've made the Exploit code found at the appendix runnable on FreeBSD 7.1 replacing some of the unsupported functions, but I'm still finding ways how to verify whether I've written successfully a data to the intended address or not. I've replaced '/dev/xf86 with '/dev/mem'. Then opened 'dev/io' instead of using 'i386_get_ioperm()'. Am I on the right track? > > > > :) > > Yeah, and IDA Pro rocks too :-) > > > -- > Andriy Gapon From avg at icyb.net.ua Fri Mar 27 07:41:39 2009 From: avg at icyb.net.ua (Andriy Gapon) Date: Fri Mar 27 07:41:46 2009 Subject: Switching to SMM with FreeBSD 6.2 onwards In-Reply-To: <492862.81876.qm@web45808.mail.sp1.yahoo.com> References: <492862.81876.qm@web45808.mail.sp1.yahoo.com> Message-ID: <49CCE59E.6020606@icyb.net.ua> on 27/03/2009 15:47 Won De Erick said the following: > --- On Fri, 3/27/09, Andriy Gapon wrote: >> on 27/03/2009 12:35 Ivan Voras said the following: >>> One thing that comes to my mind is this: >>> http://invisiblethingslab.com/resources/misc09/smm_cache_fun.pdf > > I will add that to the ff: > > http://www.ssi.gouv.fr/fr/sciences/fichiers/lti/cansecwest2006-duflot-paper.pdf > > > I've made the Exploit code found at the appendix runnable on FreeBSD 7.1 > replacing some of the unsupported functions, but I'm still finding ways how to > verify whether I've written successfully a data to the intended address or not. > I've replaced '/dev/xf86 with '/dev/mem'. Then opened 'dev/io' instead of using > 'i386_get_ioperm()'. Am I on the right track? I believe yes. I made identical changes to Joanna/Rafal's code that gets a glimpse of what SMI handler does via CPU cache. 
Interesting read :) -- Andriy Gapon From won.derick at yahoo.com Fri Mar 27 08:06:24 2009 From: won.derick at yahoo.com (Won De Erick) Date: Fri Mar 27 08:06:31 2009 Subject: Switching to SMM with FreeBSD 6.2 onwards Message-ID: <313076.76815.qm@web45801.mail.sp1.yahoo.com> --- On Fri, 3/27/09, Andriy Gapon wrote: > on 27/03/2009 15:47 Won De Erick said > the following: > > --- On Fri, 3/27/09, Andriy Gapon > wrote: > >> on 27/03/2009 12:35 Ivan Voras said the > following: > >>> One thing that comes to my mind is this: > >>> http://invisiblethingslab.com/resources/misc09/smm_cache_fun.pdf > > > > I will add that to the ff: > > > > http://www.ssi.gouv.fr/fr/sciences/fichiers/lti/cansecwest2006-duflot-paper.pdf > > > > > > I've made the Exploit code found at the appendix > runnable on FreeBSD 7.1 > > replacing some of the unsupported functions, but I'm > still finding ways how to > > verify whether I've written successfully a data to the > intended address or not. > > I've replaced '/dev/xf86 with '/dev/mem'. Then opened > 'dev/io' instead of using > > 'i386_get_ioperm()'. Am I on the right track? > > I believe yes. I made identical changes to Joanna/Rafal's > code that gets a glimpse > of what SMI handler does via CPU cache. Interesting read > :) Have you tried modifying some chipset configurations? Can I know what part? I am using IBM x3650 with dual core Xeon processor. > > -- > Andriy Gapon > Hi all, is there any tool that I can use to view the memory map I/O? 
From guru at unixarea.de Fri Mar 27 08:10:58 2009 From: guru at unixarea.de (Matthias Apitz) Date: Fri Mar 27 08:11:07 2009 Subject: CURRENT sees only /dev/ad2s1a, but not /dev/ad3s1a Message-ID: <20090327151052.GA13243@rebelion.Sisis.de> Hello, When I boot my EeePC from USB key (/dev/da0s1a) -CURRENT it sees the two SSD only as $ ls -l /dev/ad* /dev/ad2 /dev/ad2s1 /dev/ad2s1a /dev/ad3 /dev/ad3a I can mount /dev/ad2s1a but ofc not /dev/ad3s1a; when I'm booting the RELENG_7 from /dev/ad2s1a itself it looks like this: $ mount /dev/ad2s1a on / (ufs, local, noatime) /dev/ad3s1a on /usr/home (ufs, local, noatime) ... and all runs fine; more details below... What could be the reason for this? Thx matthias $ uname -a FreeBSD tinyCurrent 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Sun Mar 22 11:47:41 CET 2009 root@rebelion.Sisis.de:/usr/src/myHEAD/obj/usr/src/myHEAD/src/sys/GENERIC i386 $ ls -l /dev/ad* crw-r----- 1 root operator 0, 79 Mar 27 15:45 /dev/ad2 crw-r----- 1 root operator 0, 80 Mar 27 15:45 /dev/ad2s1 crw-r----- 1 root operator 0, 81 Mar 27 15:45 /dev/ad2s1a crw-r----- 1 root operator 0, 83 Mar 27 15:45 /dev/ad3 crw-r----- 1 root operator 0, 84 Mar 27 15:45 /dev/ad3a $ mount /dev/da0s1a on / (ufs, local) devfs on /dev (devfs, local) /dev/ad2s1a on /mnt (ufs, local, read-only) $ dmesg Copyright (c) 1992-2009 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 8.0-CURRENT #0: Sun Mar 22 11:47:41 CET 2009 root@rebelion.Sisis.de:/usr/src/myHEAD/obj/usr/src/myHEAD/src/sys/GENERIC WARNING: WITNESS option enabled, expect reduced performance. 
Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Celeron(R) M processor 900MHz (900.10-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x6d8 Stepping = 8 Features=0xafe9fbff AMD Features=0x100000 real memory = 1064828928 (1015 MB) avail memory = 1024409600 (976 MB) ACPI APIC Table: ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) acpi0: reservation of 0, a0000 (3) failed acpi0: reservation of 100000, 3f700000 (3) failed Timecounter "ACPI-safe" frequency 3579545 Hz quality 850 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 acpi_ec0: port 0x62,0x66 on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 vgapci0: port 0xec00-0xec07 mem 0xf7f00000-0xf7f7ffff,0xd0000000-0xdfffffff,0xf7ec0000-0xf7efffff irq 16 at device 2.0 on pci0 agp0: on vgapci0 agp0: detected 7932k stolen memory agp0: aperture size is 256M vgapci1: mem 0xf7f80000-0xf7ffffff at device 2.1 on pci0 pci0: at device 27.0 (no driver attached) pcib1: irq 16 at device 28.0 on pci0 pci3: on pcib1 pcib2: irq 18 at device 28.2 on pci0 pci1: on pcib2 uhci0: port 0xe400-0xe41f irq 23 at device 29.0 on pci0 uhci0: [ITHREAD] uhci0: LegSup = 0x0f30 usbus0: on uhci0 uhci1: port 0xe480-0xe49f irq 19 at device 29.1 on pci0 uhci1: [ITHREAD] uhci1: LegSup = 0x0f30 usbus1: on uhci1 uhci2: port 0xe800-0xe81f irq 18 at device 29.2 on pci0 uhci2: [ITHREAD] uhci2: LegSup = 0x0f30 usbus2: on uhci2 uhci3: port 0xe880-0xe89f irq 16 at device 29.3 on pci0 uhci3: [ITHREAD] uhci3: LegSup = 0x0f30 usbus3: on uhci3 ehci0: mem 0xf7eb7c00-0xf7eb7fff irq 23 at device 29.7 on pci0 ehci0: [ITHREAD] usbus4: EHCI version 1.0 usbus4: on ehci0 pcib3: at device 30.0 on pci0 pci4: on pcib3 isab0: at device 31.0 on pci0 isa0: on isab0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.2 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] pci0: at device 31.3 (no driver attached) acpi_lid0: on 
acpi0 acpi_button0: on acpi0 acpi_button1: on acpi0 acpi_tz0: on acpi0 battery0: on acpi0 acpi_acad0: on acpi0 atrtc0: port 0x70-0x71 irq 8 on acpi0 atkbdc0: port 0x60,0x64 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] psm0: irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: [ITHREAD] psm0: model IntelliMouse, device ID 3 cpu0: on acpi0 p4tcc0: on cpu0 pmtimer0 on isa0 orm0: at iomem 0xc0000-0xcf7ff pnpid ORM0000 on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ppc0: parallel port not found. Timecounter "TSC" frequency 900100078 Hz quality 800 Timecounters tick every 1.000 msec usbus1: 12Mbps Full Speed USB v1.0 usbus2: 12Mbps Full Speed USB v1.0 usbus3: 12Mbps Full Speed USB v1.0 usbus4: 480Mbps High Speed USB v2.0 usbus0: 12Mbps Full Speed USB v1.0 ad2: FAILURE - SET_MULTI status=51 error=4 ad2: 3847MB at ata1-master UDMA66 ugen2.1: at usbus2 uhub0: on usbus2 ugen3.1: at usbus3 uhub1: on usbus3 ugen4.1: at usbus4 uhub2: on usbus4 ugen0.1: at usbus0 uhub3: on usbus0 ugen1.1: at usbus1 uhub4: on usbus1 ad3: FAILURE - SET_MULTI status=51 error=4 ad3: 15391MB at ata1-slave UDMA66 WARNING: WITNESS option enabled, expect reduced performance. 
Root mount waiting for: usbus4 usbus3 usbus2 usbus1 usbus0 uhub0: 2 ports with 2 removable, self powered Root mount waiting for: usbus4 usbus3 usbus1 usbus0 uhub1: 2 ports with 2 removable, self powered Root mount waiting for: usbus4 usbus1 usbus0 uhub3: 2 ports with 2 removable, self powered Root mount waiting for: usbus4 usbus1 uhub4: 2 ports with 2 removable, self powered Root mount waiting for: usbus4 Root mount waiting for: usbus4 Root mount waiting for: usbus4 Root mount waiting for: usbus4 uhub2: 8 ports with 8 removable, self powered ugen4.2: at usbus4 umass0: on usbus4 umass0: SCSI over Bulk-Only; quirks = 0x0000 Root mount waiting for: usbus4 umass0:0:0:-1: Attached to scbus0 da0 at umass-sim0 bus 0 target 0 lun 0 da0: Removable Direct Access SCSI-2 device da0: 40.000MB/s transfers da0: 7712MB (15794176 512 byte sectors: 255H 63S/T 983C) Root mount waiting for: usbus4 ugen4.3: at usbus4 umass1: on usbus4 umass1: SCSI over Bulk-Only; quirks = 0x0000 Root mount waiting for: usbus4 umass1:1:1:-1: Attached to scbus1 (probe0:umass-sim1:1:0:0): TEST UNIT READY. 
CDB: 0 0 0 0 0 0 (probe0:umass-sim1:1:0:0): CAM Status: SCSI Status Error (probe0:umass-sim1:1:0:0): SCSI Status: Check Condition (probe0:umass-sim1:1:0:0): NOT READY asc:3a,0 (probe0:umass-sim1:1:0:0): Medium not present (probe0:umass-sim1:1:0:0): Unretryable error da1 at umass-sim1 bus 1 target 0 lun 0 da1: Removable Direct Access SCSI-0 device da1: 40.000MB/s transfers da1: Attempt to query device size failed: NOT READY, Medium not present Trying to mount root from ufs:/dev/da0s1a -- Matthias Apitz Manager Technical Support - OCLC GmbH Gruenwalder Weg 28g - 82041 Oberhaching - Germany t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211 e - w http://www.oclc.org/ http://www.UnixArea.de/ From eugen at kuzbass.ru Fri Mar 27 07:24:29 2009 From: eugen at kuzbass.ru (Eugene Grosbein) Date: Fri Mar 27 08:22:06 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) References: <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com> Message-ID: <49CCDD7D.FA83BF14@kuzbass.ru> Prashant Vaibhav wrote: > The primary idea is to improve the performance and resolution of > gettimeofday() and friends by creating a efficient userspace implementation > of these functions, along with some supporting modifications to the kernel. Are you aware of CLOCK_*_FAST family of timecounters present in FreeBSD 7.x? If not, you may want to take a look: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/sys/time.h#rev1.71 Eugene Grosbein
From babkin at verizon.net Fri Mar 27 08:27:13 2009 From: babkin at verizon.net (Sergey Babkin) Date: Fri Mar 27 08:38:51 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) Message-ID: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> (Sorry for the top quoting). Probably the best implementation of gettimeofday() is to have a page in the kernel mapped read-only to all the user processes. Put the kernel's idea of time into this page. Then getting the time becomes a simple read (OK, two reads, to make sure that no update has happened in between). The TSC can then be used to add the precision between the ticks of the kernel timer: i.e. remember the value of TSC when the last tick happened, and the highest rate at which TSC may be ticking at this CPU, and export in the same page. This would guarantee that the time is not moving back. However there are more issues with TSC. TSC is guaranteed to have the same value on all the processors that share the same system bus. But if the machine is built of multiple buses with bridges between them, all bets are off. Each bus may be stopped, restarted and clocked separately. There is no way to tell on which CPU the process is currently running, and it may be rescheduled to a different CPU right before or after the RDTSC instruction. -SB Mar 26, 2009 06:55:04 PM, phk@phk.freebsd.dk wrote: In message <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com>, Prashant Vaibhav writes: >The gettimeofday() function's implementation will then be >changed to read the timestamp counter (TSC) from the processor, and use the >reading in conjunction with the timing info exported by the kernel to >calculate and return the time info in proper format. I take it as read, that you know that there are other relvant functions than gettimeofday() and that these must provide a monotonic timescale when queried interleaved ? Be aware that the TSC may not be, and may not stay synchronized across multiple cores. Further more, the TSC is not constant frequency and in particular not "known frequency" at all times. There are a lot of nasty cases to check, and a nasty interpolation required, which, in my tests some years back, totally negated any speedup from using the TSC in the first place. At the very minimum, you will have to add a quirk table where known good {CPU+MOBO+BIOS} combinations can be entered, as we find them. >This will also pave way for optionally making the >FreeBSD kernel tickless, Rubbish. Timecounters are not even closely associated with the tick or ticklessness of the kernel. [1] > - The TSC frequency might change on certain processors with non-constant > TSC rate (because of SpeedStep, dynamic freq scaling etc.). The only way to > combat this is that the kernel be notified every time the processor > frequency changes. Every cpu frequency driver will need to be updated to > notify the kernel before and after a cpu freq change. That is not good enough, the bios may autonomously change the cpu speed and the skew from not knowing exactly _when_ and _how_ the cpu clock changed, is a significant number of microseconds, plenty of time to make strange things happen. You will want to study carefully Dave Mills work to tame the alpha chips wandering SAW clocks. Poul-Henning [1] In my mind, reworking the callout system in the kernel would be a much better more neded and much more worthwhile project. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
From danny at cs.huji.ac.il Fri Mar 27 09:36:47 2009 From: danny at cs.huji.ac.il (Danny Braniss) Date: Fri Mar 27 09:36:54 2009 Subject: amr driver broken since March 12 In-Reply-To: <49CCDA41.4060101@samsco.org> References: <49CCDA41.4060101@samsco.org> Message-ID: > Danny Braniss wrote: > > at least for me :-) > > [and sorry for the cross posting] > > > > old (March 12 , i know need the svn rev number but...) > > None of the commit activity on March 12 is jumping out at me as being > suspicious. However, you are now the second person who has told me > about AMR problems in 7.1 recently. If you have a precise svn change > number, it would help greatly. > > Scott my bad. the last working amr/iir is from March 12. I first detected the problem sometime later, but not later than March 23. So it has to be changes in that time frame. both drivers are showing similar symptoms: waiting for not busy the iir goes on for ever, and it's the cam that eventually panics, run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config (actually not 100% true, depending if WITNESS is on or off, it sometimes just hangs). the amr seems to time out: amr0: adapter is busy thanks for looking into the problem, danny From scottl at samsco.org Fri Mar 27 09:40:08 2009 From: scottl at samsco.org (Scott Long) Date: Fri Mar 27 09:43:40 2009 Subject: amr driver broken since March 12 In-Reply-To: References: <49CCDA41.4060101@samsco.org> Message-ID: <49CCF95F.1050307@samsco.org> Danny Braniss wrote: >> Danny Braniss wrote: >>> at least for me :-) >>> [and sorry for the cross posting] >>> >>> old (March 12 , i know need the svn rev number but...) >> None of the commit activity on March 12 is jumping out at me as being >> suspicious. However, you are now the second person who has told me >> about AMR problems in 7.1 recently. If you have a precise svn change >> number, it would help greatly. >> >> Scott > my bad.
the last working amr/iir is from March 12. > I first detected the problem sometime later, but not later than March 23. > So it has to be changes in that time frame. > > both drivers are showing similar symptoms: > waiting for not busy > the iir goes on for ever, and it's the cam that eventually panics, > run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config > (actually not 100% true, depending if WITNESS is on or off, it sometimes > just hangs). > the amr seems to time out: > amr0: adapter is busy > > thanks for looking into the problem, > > danny > > Ok, here are a series of revisions to step through, in forward order. Make sure that you are starting with at least revision 189568. Then, update to exactly the revision numbers below, recompile the kernel, and test: 190087 190091 From lambert at lambertfam.org Fri Mar 27 09:40:12 2009 From: lambert at lambertfam.org (Scott Lambert) Date: Fri Mar 27 09:43:54 2009 Subject: amr driver broken since March 12 In-Reply-To: References: <49CCDA41.4060101@samsco.org> Message-ID: <20090327162915.GQ80292@sysmon.tcworks.net> On Fri, Mar 27, 2009 at 06:52:32PM +0300, Danny Braniss wrote: > > Danny Braniss wrote: > > > at least for me :-) > > > [and sorry for the cross posting] > > > > > > old (March 12 , i know need the svn rev number but...) > > > > None of the commit activity on March 12 is jumping out at me as being > > suspicious. However, you are now the second person who has told me > > about AMR problems in 7.1 recently. If you have a precise svn change > > number, it would help greatly. > > > > Scott (Long) > > my bad. the last working amr/iir is from March 12. > I first detected the problem sometime later, but not later than March 23. > So it has to be changes in that time frame. I think Scott Long was actually asking if you could try to cvsup (or csup) to a date between those two and see if the problem shows there. 
If you go for (23 - 12)/2 + 12, something like March 17, it would help to narrow down which changes could be causing the problem. If you see the problem with a March 17 kernel, you can split the time between March 12 and 17 and try again. Then just keep cutting the search space in half until you can pretty much say "This is the commit that broke things for me." It's not always possible for someone to take the time to do the binary search for the actual commit which broke things for them. But when they can, it really helps the developers. Just cutting it down from 11 days to 5 or 6 days can probably be a big help. -- Scott Lambert KC5MLE Unix SysAdmin lambert@lambertfam.org From onemda at gmail.com Fri Mar 27 09:52:42 2009 From: onemda at gmail.com (Paul B. Mahol) Date: Fri Mar 27 09:52:49 2009 Subject: CURRENT sees only /dev/ad2s1a, but not /dev/ad3s1a In-Reply-To: <20090327151052.GA13243@rebelion.Sisis.de> References: <20090327151052.GA13243@rebelion.Sisis.de> Message-ID: <3a142e750903270952h3ba5e28fp72b39283b2a46d97@mail.gmail.com> On 3/27/09, Matthias Apitz wrote: > > Hello, > > When I boot my EeePC from USB key (/dev/da0s1a) -CURRENT it sees the two SSD > only > as > > $ ls -l /dev/ad* > /dev/ad2 > /dev/ad2s1 > /dev/ad2s1a > /dev/ad3 > /dev/ad3a > > I can mount /dev/ad2s1a but ofc not /dev/ad3s1a; > > when I'm booting the RELENG_7 from /dev/ad2s1a itself it looks like this: > > $ mount > /dev/ad2s1a on / (ufs, local, noatime) > /dev/ad3s1a on /usr/home (ufs, local, noatime) CURRENT has replaced geom_bsd with geom_part_bsd, and that can cause various problems; search the current archives for more info.
-- Paul From scottl at samsco.org Fri Mar 27 09:51:40 2009 From: scottl at samsco.org (Scott Long) Date: Fri Mar 27 10:02:17 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> Message-ID: <49CD0405.1060704@samsco.org> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid. Scott Sergey Babkin wrote: > (Sorry for the top quoting). Probably the best implementation of > gettimeofday() is to have > a page in the kernel mapped read-only to all the user processes. Put > the kernel's idea of time > into this page. Then getting the time becomes a simple read (OK, two > reads, to make sure that > no update has happened in between). > The TSC can then be used to add the precision between the ticks of > the kernel timer: > i.e. remember the value of TSC when the last tick happened, and the > highest rate at which > TSC may be ticking at this CPU, and export in the same page. This > would guarantee that the time is not moving back. > However there are more issues with TSC. TSC is guaranteed to have > the same value > on all the processors that share the same system bus. But if the > machine is built of multiple > buses with bridges between them, all bets are off. Each bus may be > stopped, restarted > and clocked separately. There is no way to tell on which CPU the > process is currently > running, and it may be rescheduled to a different CPU right before > or after the RDTSC > instruction.
> -SB > Mar 26, 2009 06:55:04 PM, phk@phk.freebsd.dk wrote: > > In message <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com>, Prashant Vaibhav writes: > >The gettimeofday() function's implementation will then be > >changed to read the timestamp counter (TSC) from the processor, > and use the > >reading in conjunction with the timing info exported by the > kernel to > >calculate and return the time info in proper format. > I take it as read, that you know that there are other relvant > functions than gettimeofday() and that these must provide a > monotonic timescale when queried interleaved ? > Be aware that the TSC may not be, and may not stay synchronized > across multiple cores. > Further more, the TSC is not constant frequency and in particular > not "known frequency" at all times. > There are a lot of nasty cases to check, and a nasty interpolation > required, which, in my tests some years back, totally negated any > speedup from using the TSC in the first place. > At the very minimum, you will have to add a quirk table where > known good {CPU+MOBO+BIOS} combinations can be entered, as we > find them. > >This will also pave way for optionally making the > >FreeBSD kernel tickless, > Rubbish. Timecounters are not even closely associated with the > tick or ticklessness of the kernel. [1] > > - The TSC frequency might change on certain processors with > non-constant > > TSC rate (because of SpeedStep, dynamic freq scaling etc.). The > only way to > > combat this is that the kernel be notified every time the > processor > > frequency changes. Every cpu frequency driver will need to be > updated to > > notify the kernel before and after a cpu freq change. > That is not good enough, the bios may autonomously change the cpu > speed > and the skew from not knowing exactly _when_ and _how_ the cpu > clock > changed, is a significant number of microseconds, plenty of time > to make strange things happen.
> You will want to study carefully Dave Mills work to tame the alpha > chips wandering SAW clocks. > Poul-Henning > [1] In my mind, reworking the callout system in the kernel would > be a much better more neded and much more worthwhile project. > -- > Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 > phk@FreeBSD.ORG | TCP/IP since RFC 956 > FreeBSD committer | BSD since 4.3-tahoe > Never attribute to malice what can adequately be explained by > incompetence. From phk at phk.freebsd.dk Fri Mar 27 10:31:30 2009 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Fri Mar 27 10:31:38 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: Your message of "Fri, 27 Mar 2009 10:51:17 CST." <49CD0405.1060704@samsco.org> Message-ID: <5739.1238175087@critter.freebsd.dk> In message <49CD0405.1060704@samsco.org>, Scott Long writes: >I've been talking about this for years. All I need is help with the VM >magic to create the page on fork. I also want two pages, one global >for gettimeofday (and any other global data we can think of) and one >per-process for static data like getpid/getgid. Agreed, that is a good place to start.
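A minimal sketch of the "two reads" scheme for a shared time page discussed in this thread, assuming a hypothetical page layout with a generation counter that the writer bumps before and after each update. The names `timepage`, `timepage_update` and `timepage_read` are illustrative and not taken from any FreeBSD source. It uses C11 atomics and shows only the consistency check. It does not do the VM work to map the page, and it does not attempt the TSC interpolation or address the cross-CPU skew raised elsewhere in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout of the shared time page. */
struct timepage {
    atomic_uint     gen;   /* odd while the writer is mid-update */
    _Atomic int64_t sec;   /* kernel's idea of time, seconds */
    _Atomic int64_t nsec;  /* ... and nanoseconds */
};

/* Writer side (the kernel, in the scheme described above). */
static void
timepage_update(struct timepage *tp, int64_t sec, int64_t nsec)
{
    unsigned g = atomic_load_explicit(&tp->gen, memory_order_relaxed);

    /* Mark the page busy: generation becomes odd. */
    atomic_store_explicit(&tp->gen, g + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&tp->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&tp->nsec, nsec, memory_order_relaxed);
    /* Publish: generation becomes even again. */
    atomic_store_explicit(&tp->gen, g + 2, memory_order_release);
}

/*
 * Reader side (userland): read the generation, the data, then the
 * generation again, and retry if an update was in progress or
 * happened in between.
 */
static void
timepage_read(struct timepage *tp, int64_t *sec, int64_t *nsec)
{
    unsigned g0, g1;

    do {
        g0 = atomic_load_explicit(&tp->gen, memory_order_acquire);
        *sec = atomic_load_explicit(&tp->sec, memory_order_relaxed);
        *nsec = atomic_load_explicit(&tp->nsec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        g1 = atomic_load_explicit(&tp->gen, memory_order_relaxed);
    } while ((g0 & 1) != 0 || g0 != g1);
}
```

The reader never writes to the page, which is what would allow it to be mapped read-only into every process; the cost is that a reader may have to retry while an update is in flight.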
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From pisymbol at gmail.com Fri Mar 27 11:19:18 2009 From: pisymbol at gmail.com (Alexander Sack) Date: Fri Mar 27 11:19:25 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <5739.1238175087@critter.freebsd.dk> References: <49CD0405.1060704@samsco.org> <5739.1238175087@critter.freebsd.dk> Message-ID: <3c0b01820903271119l4161c7b8yf74613b184add487@mail.gmail.com> On Fri, Mar 27, 2009 at 1:31 PM, Poul-Henning Kamp wrote: > In message <49CD0405.1060704@samsco.org>, Scott Long writes: > >>I've been talking about this for years. All I need is help with the VM >>magic to create the page on fork. I also want two pages, one global >>for gettimeofday (and any other global data we can think of) and one >>per-process for static data like getpid/getgid. > > Agreed, that is a good place to start. I'm assuming folks are still in love with the TSC because it is still the cheapest, as opposed to ACPI-fast or HPET, to even contemplate this? Also, I thought PHK's comment (Sergey mentioned it too) was true regardless of bus: the TSC is not consistent across multiple packages (and, for that matter, I suppose cores), due to, I *think*, its ISA lineage. So how does this work again? Won't the rate at which you tick up be sporadic over the course of a process being scheduled on different cores? (i.e.
depending on what core RDTSC happened to land on) -aps From rwatson at FreeBSD.org Fri Mar 27 11:24:00 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Fri Mar 27 11:24:13 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49CD0405.1060704@samsco.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> Message-ID: On Fri, 27 Mar 2009, Scott Long wrote: > I've been talking about this for years. All I need is help with the VM > magic to create the page on fork. I also want two pages, one global for > gettimeofday (and any other global data we can think of) and one per-process > for static data like getpid/getgid. FWIW, there are some variations in schemes across OS's -- one extreme is the Linux approach, which actually exports a mini shared library in ELF format on the shared page, providing implementations of various services (such as entering system calls), time stuff, etc. Less extreme are the shared pages offered on Mac OS X, etc. Robert N M Watson Computer Laboratory University of Cambridge > > Scott > > > Sergey Babkin wrote: >> (Sorry for the top quoting). Probably the best implementation of >> gettimeofd=y() is to have >> a page in the kernel mapped read-only to all the user pr=cesses. Put >> the kernel's idea of time >> into this page. Then getting the =ime becomes a simple read (OK, two >> reads, to make sure that >> no update =as happened in between). >> The TSC can then be used to add the precis=on between the ticks of >> the kernel timer: >> i.e. remember the value of TS= when the last tick happen, and the >> highest rate at which >> TSC may be ti=king at this CPU, and export in the same page. This >> would guarantee thatthe time is not moving back. >> However there are more issues with TS=. TSC is guaranteed to have >> the same value >> on all the processors that s=are the same system bus. 
>> But if the machine is built of multiple buses with bridges between them, all bets are off. Each bus may be stopped, restarted and clocked separately. There is no way to tell on which CPU the process is currently running, and it may be rescheduled to a different CPU right before or after the RDTSC instruction.
>> -SB
>>
>> Mar 26, 2009 06:55:04 PM, phk@phk.freebsd.dk wrote:
>>
>> In message <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com>, Prashant Vaibhav writes:
>> >The gettimeofday() function's implementation will then be
>> >changed to read the timestamp counter (TSC) from the processor, and use the
>> >reading in conjunction with the timing info exported by the kernel to
>> >calculate and return the time info in proper format.
>>
>> I take it as read, that you know that there are other relevant functions than gettimeofday() and that these must provide a monotonic timescale when queried interleaved ?
>>
>> Be aware that the TSC may not be, and may not stay synchronized across multiple cores.
>>
>> Furthermore, the TSC is not constant frequency and in particular not "known frequency" at all times.
>>
>> There are a lot of nasty cases to check, and a nasty interpolation required, which, in my tests some years back, totally negated any speedup from using the TSC in the first place.
>>
>> At the very minimum, you will have to add a quirk table where known good {CPU+MOBO+BIOS} combinations can be entered, as we find them.
>>
>> >This will also pave way for optionally making the
>> >FreeBSD kernel tickless,
>>
>> Rubbish. Timecounters are not even closely associated with the tick or ticklessness of the kernel. [1]
>>
>> > - The TSC frequency might change on certain processors with non-constant
>> > TSC rate (because of SpeedStep, dynamic freq scaling etc.). The only way to
>> > combat this is that the kernel be notified every time the processor
>> > frequency changes. Every cpu frequency driver will need to be updated to
>> > notify the kernel before and after a cpu freq change.
>>
>> That is not good enough, the bios may autonomously change the cpu speed and the skew from not knowing exactly _when_ and _how_ the cpu clock changed, is a significant number of microseconds, plenty of time to make strange things happen.
>>
>> You will want to study carefully Dave Mills work to tame the alpha chips wandering SAW clocks.
>>
>> Poul-Henning
>>
>> [1] In my mind, reworking the callout system in the kernel would be a much better more needed and much more worthwhile project.
>>
>> --
>> Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
>> phk@FreeBSD.ORG | TCP/IP since RFC 956
>> FreeBSD committer | BSD since 4.3-tahoe
>> Never attribute to malice what can adequately be explained by incompetence.
>> _______________________________________________
>> freebsd-hackers@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>> _______________________________________________
>> freebsd-current@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

From rwatson at FreeBSD.org Fri Mar 27 11:25:15 2009
From: rwatson at FreeBSD.org (Robert Watson)
Date: Fri Mar 27 11:25:33 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
In-Reply-To: <49CD0405.1060704@samsco.org>
References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org>
Message-ID:

On Fri, 27 Mar 2009, Scott Long wrote:

> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.

One note though -- the time to do the global page is at execve()-time.

Robert N M Watson
Computer Laboratory
University of Cambridge

> Scott
>
> Sergey Babkin wrote:
>> (Sorry for the top quoting). Probably the best implementation of gettimeofday() is to have a page in the kernel mapped read-only to all the user processes. Put the kernel's idea of time into this page. Then getting the time becomes a simple read (OK, two reads, to make sure that no update has happened in between).
>> The TSC can then be used to add the precision between the ticks of the kernel timer: i.e. remember the value of TSC when the last tick happen, and the highest rate at which TSC may be ticking at this CPU, and export in the same page. This would guarantee that the time is not moving back.
From scottl at samsco.org Fri Mar 27 11:30:28 2009
From: scottl at samsco.org (Scott Long)
Date: Fri Mar 27 12:34:32 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
In-Reply-To:
References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org>
Message-ID: <49CD1B3D.3030103@samsco.org>

Robert Watson wrote:
> On Fri, 27 Mar 2009, Scott Long wrote:
>
>> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.
>
> FWIW, there are some variations in schemes across OS's -- one extreme is the Linux approach, which actually exports a mini shared library in ELF format on the shared page, providing implementations of various services (such as entering system calls), time stuff, etc. Less extreme are the shared pages offered on Mac OS X, etc.

Yes, but I'd like to start somewhere, and considering that it's been impossible in _5_ years to get the 30 minutes of Peter or JeffR or JHB time to get the basic VM magic done, I'm keeping my expectations as modest as possible.
Scott

From julian at elischer.org Fri Mar 27 13:02:31 2009
From: julian at elischer.org (Julian Elischer)
Date: Fri Mar 27 13:02:43 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
In-Reply-To: <49CD0405.1060704@samsco.org>
References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org>
Message-ID: <49CD30E9.7030501@elischer.org>

Scott Long wrote:
> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.

Interestingly, it is even feasible to have a per-thread page. It requires that the scheduler change a page table entry, though.

From julian at elischer.org Fri Mar 27 13:08:15 2009
From: julian at elischer.org (Julian Elischer)
Date: Fri Mar 27 13:08:21 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
In-Reply-To: <49CD1B3D.3030103@samsco.org>
References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD1B3D.3030103@samsco.org>
Message-ID: <49CD3242.8050802@elischer.org>

Scott Long wrote:
> Robert Watson wrote:
>> On Fri, 27 Mar 2009, Scott Long wrote:
>>
>>> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.
>>
>> FWIW, there are some variations in schemes across OS's -- one extreme is the Linux approach, which actually exports a mini shared library in ELF format on the shared page, providing implementations of various services (such as entering system calls), time stuff, etc. Less extreme are the shared pages offered on Mac OS X, etc.
>> > > Yes, but I'd like to start somewhere, and considering that it's been > impossible in _5_ years to get the 30 minutes of Peter or JeffR or JHB > time to get the basic VM magic done, I'm keeping my expectations as > modest as possible. try alc.. :-) > > Scott > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" From cswiger at mac.com Fri Mar 27 13:16:05 2009 From: cswiger at mac.com (Chuck Swiger) Date: Fri Mar 27 13:21:53 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49CD1B3D.3030103@samsco.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD1B3D.3030103@samsco.org> Message-ID: <2C3C7185-CB37-4067-B2A9-A03B5B288606@mac.com> Hi, Scott & all-- On Mar 27, 2009, at 11:30 AM, Scott Long wrote: > Robert Watson wrote: >> On Fri, 27 Mar 2009, Scott Long wrote: >>> I've been talking about this for years. All I need is help with >>> the VM magic to create the page on fork. I also want two pages, >>> one global for gettimeofday (and any other global data we can >>> think of) and one per-process for static data like getpid/getgid. >> FWIW, there are some variations in schemes across OS's -- one >> extreme is the Linux approach, which actually exports a mini shared >> library in ELF format on the shared page, providing implementations >> of various services (such as entering system calls), time stuff, >> etc. Less extreme are the shared pages offered on Mac OS X, etc. > > Yes, but I'd like to start somewhere, and considering that it's been > impossible in _5_ years to get the 30 minutes of Peter or JeffR or JHB > time to get the basic VM magic done, I'm keeping my expectations as > modest as possible. 
I'm not entirely sure how close the Mach/xnu and FreeBSD implementations of pmap_* stuff are, but the xnu code for commpage stuff is here:

http://www.opensource.apple.com/darwinsource/Current/xnu-1228.9.59/osfmk/i386/pmap.c
[pmap_commpage32_init(), pmap_commpage64_init()]

http://www.opensource.apple.com/darwinsource/Current/xnu-1228.9.59/osfmk/i386/commpage/
[all :-)]

http://www.opensource.apple.com/darwinsource/Current/xnu-1228.9.59/osfmk/i386/commpage/commpage_gettimeofday.s
[but this one in particular]

http://www.opensource.apple.com/darwinsource/Current/xnu-1228.9.59/osfmk/vm/vm_shared_region.c
[cf "COMM PAGE" comments, vm_commpage_init()]

http://www.opensource.apple.com/darwinsource/Current/xnu-1228.9.59/bsd/kern/kern_fork.c
[fork_create_child(), procdup(), uses of pmap_map_sharedpage()]

[ ADC login might be needed, otherwise I think rwatson has been importing xnu periodically for TrustedBSD or other work, and might be able to provide similar pointers... ]

Regards,
--
-Chuck

From pisymbol at gmail.com Fri Mar 27 13:23:28 2009
From: pisymbol at gmail.com (Alexander Sack)
Date: Fri Mar 27 13:23:34 2009
Subject: Building a DDB friendly kernel/drivers?
Message-ID: <3c0b01820903271323t1376671enf7be1febd30113be@mail.gmail.com>

Hi Folks:

I'm debugging an issue with a third-party driver that causes an NMI during driver initialization. It only occurs for one version of the driver thus far. I want to isolate what triggers the NMI and generally get a feel for the initialization of the hardware.

I'm running a 6.x-amd64 kernel. Can someone explain to me why, when I compile the kernel with default debugging options (makeoptions -g, options DDB/KDB etc. etc.), the kernel comes up and boots BUT if I remove -O2 and -frename-registers (in an effort to make text even close to readable), the kernel boots and then double-faults on mounting root?

I guess more importantly, what's the RIGHT way to build a DDB/KDB friendly kernel?
I thought -O2 and/or -frename-registers could cause DDB to act up but perhaps I'm wrong.

Thanks!

-aps

From babkin at verizon.net Fri Mar 27 14:14:18 2009
From: babkin at verizon.net (Sergey Babkin)
Date: Fri Mar 27 14:52:31 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
Message-ID: <33531707.21385.1238188446396.JavaMail.root@vms074.mailsrvcs.net>

Would not a normal mmap be duplicated on fork? I'd do it as a small pseudo-driver that allows to mmap this page. Then libc would open this pseudo-device and mmap it, either in the on-load handler or on the first call of gettimeofday(). I think, that should be it, no special magic necessary.

The per-process is more difficult and would require the magic :-) Or maybe no magic as such: just mmap the file from the /proc filesystem. Then on fork in the child unmap this page, open the new file, and map it. vfork will still be tricky :-)

It also means wasting an extra page per process.

-SB

Mar 27, 2009 12:51:56 PM, scottl@samsco.org wrote:

I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.

Scott

Sergey Babkin wrote:
> (Sorry for the top quoting). Probably the best implementation of gettimeofday() is to have a page in the kernel mapped read-only to all the user processes. Put the kernel's idea of time into this page. Then getting the time becomes a simple read (OK, two reads, to make sure that no update has happened in between).

From rwatson at FreeBSD.org Fri Mar 27 15:59:40 2009
From: rwatson at FreeBSD.org (Robert Watson)
Date: Fri Mar 27 15:59:49 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
In-Reply-To: <33531707.21385.1238188446396.JavaMail.root@vms074.mailsrvcs.net>
References: <33531707.21385.1238188446396.JavaMail.root@vms074.mailsrvcs.net>
Message-ID:

On Fri, 27 Mar 2009, Sergey Babkin wrote:

> Would not a normal mmap be duplicated on fork? I'd do it as a small pseudo-driver that allows to mmap this page. Then libc would open this pseudo-device and mmap it, either in the on-load handler or on the first call of gettimeofday(). I think, that should be it, no special magic necessary.
> The per-process is more difficult and would require the magic :-) Or maybe no magic as such: just mmap the file from the /proc filesystem. Then on fork in the child unmap this page, open the new file, and map it. vfork will still be tricky :-)
> It also means wasting an extra page per process.

Part of the point of mapping in the page at execve()-time, or fork()-time for per-process pages (which I'm not entirely convinced we need yet) is to avoid the cost of an extra device open, mmap, etc, for every execve(), which can be quite expensive. I stuck a prototype page mapped from a special device exporting time information here a year or two ago:

http://www.watson.org/~robert/freebsd/20080203-evilmem.diff
http://www.watson.org/~robert/freebsd/evilmem_test.c

This doesn't do TSC-based adjustment, just drops a timestamp in from the callout wheel, but was intended to allow Kris to do a bit of comparative benchmarking and decide if it might be a viable approach to invest further work in. Obviously, the above code should never, ever, get near a production kernel, since it was a 2-hour hack for experimental purposes.

I think the right way forward is to prototype: map the page in at execve()-time in the kernel and pass the address to rtld via elf auxiliary arguments, and have rtld link it (via some or another means), exposing symbols or code or whatever, to libc. If someone wants to make it a dynamic shared object in ELF-speak, then I'm all for that as it would minimize the work rtld had to do.

I guess interesting questions are whether (a) it would be desirable to have per-page, per-cpu, or per-thread mappings. If there are non-synchronized TSCs, then there might be some interesting advantages to a per-CPU page.
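
A minimal sketch of the "two reads" idea quoted in this thread, written here as a generation counter around the exported time. Everything named below (struct time_page, time_page_publish(), time_page_read()) is hypothetical: it is not taken from the evilmem patch or from any FreeBSD interface. In a real design the writer would presumably run in the kernel and userland would only ever see the page mapped read-only; here both sides live in one program so the protocol can be exercised.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout of the exported time page; not a FreeBSD structure. */
struct time_page {
	atomic_uint gen;	/* odd while the writer is mid-update */
	_Atomic int64_t sec;	/* seconds part of the kernel's idea of time */
	_Atomic int64_t usec;	/* microseconds part */
};

/* Writer side: roughly what a kernel timer tick would do (assumed). */
static void
time_page_publish(struct time_page *tp, int64_t sec, int64_t usec)
{
	unsigned g = atomic_load_explicit(&tp->gen, memory_order_relaxed);

	/* Mark the page busy (odd generation) before touching the data. */
	atomic_store_explicit(&tp->gen, g + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&tp->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&tp->usec, usec, memory_order_relaxed);
	/* Publish: even generation again, ordered after the data stores. */
	atomic_store_explicit(&tp->gen, g + 2, memory_order_release);
}

/* Reader side: read the generation before and after the data, retry on change. */
static void
time_page_read(struct time_page *tp, int64_t *sec, int64_t *usec)
{
	unsigned g1, g2;

	do {
		g1 = atomic_load_explicit(&tp->gen, memory_order_acquire);
		*sec = atomic_load_explicit(&tp->sec, memory_order_relaxed);
		*usec = atomic_load_explicit(&tp->usec, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		g2 = atomic_load_explicit(&tp->gen, memory_order_relaxed);
	} while ((g1 & 1) != 0 || g1 != g2);
}
```

The reader never writes to the page, which is what would let it stay read-only in userland; the price is a retry loop when an update lands between the two generation reads. TSC-based interpolation between updates is deliberately left out of this sketch.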
Robert N M Watson
Computer Laboratory
University of Cambridge

> -SB
> Mar 27, 2009 12:51:56 PM, scottl@samsco.org wrote:
>
> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.
> Scott
> Sergey Babkin wrote:
> > (Sorry for the top quoting). Probably the best implementation of gettimeofday() is to have a page in the kernel mapped read-only to all the user processes. Put the kernel's idea of time into this page. Then getting the time becomes a simple read (OK, two reads, to make sure that no update has happened in between).

From phk at phk.freebsd.dk Fri Mar 27 16:02:06 2009
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Fri Mar 27 16:02:31 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
In-Reply-To: Your message of "Fri, 27 Mar 2009 22:59:39 GMT."
Message-ID: <7319.1238194922@critter.freebsd.dk>

In message , Robert Watson writes:

>I guess interesting questions are whether (a) it would be desirable to have
>per-page, per-cpu, or per-thread mappings. If there are non-synchronized
>TSCs, then there might be some interesting advantages to a per-CPU page.

Rule #3:
	The only thing worse than generalizing from one example is generalizing from no examples at all.

We can add those mappings when we know why we would want them.
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From rwatson at FreeBSD.org Fri Mar 27 16:05:37 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Fri Mar 27 16:05:51 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <7319.1238194922@critter.freebsd.dk> References: <7319.1238194922@critter.freebsd.dk> Message-ID: On Fri, 27 Mar 2009, Poul-Henning Kamp wrote: > In message , Robert Wats > on writes: > >> I guess interesting questions are whether (a) it would be desirable to have >> per-page, per-cpu, or per-thread mappings. If there are non-synchronized >> TSCs, then there might be some interesting advantages to a per-CPU page. > > Rule #3: > The only thing worse than generalizing from one example is > generalizing from no examples at all. > > We can add those mappings when we know why we would want them. If we believe TSCs won't be synchronized, and don't want to synchronize them ourselves, then we'll need different mapping state to get from a TSC stamp to a time on different CPUs. In which case user application threads will need to know their CPU in order to use the right conversion data (ideally without a system call, since that's part of what we're avoiding here), or use a per-CPU mapping and not know (in which case they'll need to detect and handle the very rare "preempted and migrated between read TSC and read conversion data" race). I'm not pushing a per-CPU page, but there would be some interesting advantages to supporting that. Robert N M Watson Computer Laboratory University of Cambridge From phk at phk.freebsd.dk Fri Mar 27 16:10:40 2009 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Fri Mar 27 16:10:46 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: Your message of "Fri, 27 Mar 2009 23:05:34 GMT." 
Message-ID: <7362.1238195438@critter.freebsd.dk> In message , Robert Wats on writes: >In which case user application threads will need to >know their CPU [...] Didn't jemalloc solve that problem once already ? -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From rwatson at FreeBSD.org Fri Mar 27 17:08:43 2009 From: rwatson at FreeBSD.org (Robert Watson) Date: Fri Mar 27 17:14:42 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <7362.1238195438@critter.freebsd.dk> References: <7362.1238195438@critter.freebsd.dk> Message-ID: On Fri, 27 Mar 2009, Poul-Henning Kamp wrote: > In message , Robert > Wats on writes: > >> In which case user application threads will need to know their CPU [...] > > Didn't jemalloc solve that problem once already ? I think jemalloc implements thread-affinity for arenas rather than CPU-affinity in the strict sense, but I may misread. Robert N M Watson Computer Laboratory University of Cambridge From jasone at FreeBSD.org Fri Mar 27 17:45:28 2009 From: jasone at FreeBSD.org (Jason Evans) Date: Fri Mar 27 17:45:35 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: References: <7362.1238195438@critter.freebsd.dk> Message-ID: <49CD6E90.6050303@FreeBSD.org> Robert Watson wrote: > On Fri, 27 Mar 2009, Poul-Henning Kamp wrote: >> In message , >> Robert Wats on writes: >> >>> In which case user application threads will need to know their CPU [...] >> >> Didn't jemalloc solve that problem once already ? > > I think jemalloc implements thread-affinity for arenas rather than > CPU-affinity in the strict sense, but I may misread. CPU affinity is of limited use to malloc unless it can safely pin threads to CPUs. Unfortunately, malloc cannot muck with CPU affinity, since that's up to the application. 
Therefore, as you say, jemalloc implements (dynamically balanced) arena affinity. It might work okay in practice to use the current CPU ID to decide which arena to use, if the scheduler does not often migrate running processes. I haven't explored that possibility though, since the infrastructure for cheaply querying the CPU ID doesn't currently (to my knowledge) exist. Jason From samflanker at gmail.com Sat Mar 28 04:38:46 2009 From: samflanker at gmail.com (Vladimir Ermakov) Date: Sat Mar 28 04:38:52 2009 Subject: [problem] aac0 does not respond Message-ID: 2009/3/24 Vladimir Ermakov : >Hello, All > >Describe my problem: >have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 >2 HHD of 6 have errors in smart data (damaged) >i am try read file /var/db/mysql/ibdata1 from this volume >system does not respond ( lost access to ssh ) after read 6GB data >from this >file >and print debug messages on ttyv0 hi tried load FreeBSD 6.3-RELEASE i386, mount ufs partition from this volume and read the file --------------------------------- # cat ibdata1 > /dev/null # echo $? 0 # --------------------------------- on FreeBSD 6.3-RELEASE i386 this problem does not /Vladimir Ermakov From astrodog at gmail.com Sat Mar 28 06:10:15 2009 From: astrodog at gmail.com (Astrodog) Date: Sat Mar 28 06:20:01 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49CD6E90.6050303@FreeBSD.org> References: <7362.1238195438@critter.freebsd.dk> <49CD6E90.6050303@FreeBSD.org> Message-ID: <2fd864e0903280546r5eb7ae8avc96fd2d21cac1a7e@mail.gmail.com> On Fri, Mar 27, 2009 at 7:25 PM, Jason Evans wrote: > Robert Watson wrote: >> >> On Fri, 27 Mar 2009, Poul-Henning Kamp wrote: >>> >>> In message , >>> Robert Wats on writes: >>> >>>> In which case user application threads will need to know their CPU [...] >>> >>> Didn't jemalloc solve that problem once already ? 
>>
>> I think jemalloc implements thread-affinity for arenas rather than CPU-affinity in the strict sense, but I may misread.
>
> CPU affinity is of limited use to malloc unless it can safely pin threads to CPUs. Unfortunately, malloc cannot muck with CPU affinity, since that's up to the application. Therefore, as you say, jemalloc implements (dynamically balanced) arena affinity.
>
> It might work okay in practice to use the current CPU ID to decide which arena to use, if the scheduler does not often migrate running processes. I haven't explored that possibility though, since the infrastructure for cheaply querying the CPU ID doesn't currently (to my knowledge) exist.
>
> Jason

Hopefully, this is a more reasonable CC list, yet will still get to everyone...

First, re: scottl's creating pages on fork, I might be able to do that. I'll give it a shot when I get back to my machine at home, and let you know if it either works, or just blows up in my face, and causes the usual brain melt I get when I poke at VM stuff.

As far as thread CPU affinity goes, as I understand things, this is implemented in sched_ule, and one could certainly make a version of malloc that takes advantage of this... however, locking a thread to a CPU has some pretty significant side effects. Even if you only lock running/runnable threads to a CPU, you could end up with some horribly unbalanced scheduling, depending entirely on the load of the machine when the threads are started, and pinned, that the scheduler cannot balance, even on a system with moderate load, which would probably hurt performance more than most things one could do trying to get an accurate, fast timer. If nothing else, it'd be a nightmare on machines with intermittent high load, and it'd produce fairly inconsistent performance.
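
The alternative to pinning that was raised earlier in this thread is to leave threads unpinned and detect a migration around the TSC read. A rough sketch of that check follows, assuming a cheap way to query the current CPU from userland exists (which, per this thread, it may not yet). The hooks cur_cpu and read_tsc, the struct tsc_conv layout, and the sim_* helpers are all invented for illustration; the sim_* functions only script one migration so the retry path can be seen.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 2	/* illustrative CPU count */

/* Hypothetical per-CPU conversion data; not an existing FreeBSD structure. */
struct tsc_conv {
	uint64_t tsc_base;	/* TSC value at the last kernel update */
	uint64_t ns_base;	/* nanoseconds at that update */
	uint64_t tsc_hz;	/* TSC frequency assumed for this CPU */
};

/* Assumed hooks standing in for "which CPU am I on" and RDTSC. */
typedef int (*cur_cpu_fn)(void);
typedef uint64_t (*read_tsc_fn)(void);

static uint64_t
percpu_now_ns(const struct tsc_conv conv[NCPU], cur_cpu_fn cur_cpu,
    read_tsc_fn read_tsc)
{
	for (;;) {
		int cpu = cur_cpu();
		uint64_t tsc = read_tsc();

		if (cur_cpu() != cpu)
			continue;	/* migrated around the TSC read: retry */

		const struct tsc_conv *c = &conv[cpu];
		uint64_t delta = tsc - c->tsc_base;

		/* Split the scaling so delta * 1e9 cannot overflow 64 bits. */
		return c->ns_base +
		    (delta / c->tsc_hz) * 1000000000ULL +
		    (delta % c->tsc_hz) * 1000000000ULL / c->tsc_hz;
	}
}

/* Simulated hooks: the thread "moves" from CPU 0 to CPU 1 once. */
static int sim_cpu_seq[] = { 0, 1, 1, 1 };
static int sim_cpu_idx;

static int
sim_cur_cpu(void)
{
	return sim_cpu_seq[sim_cpu_idx++];
}

static uint64_t
sim_read_tsc(void)
{
	return 3000;
}
```

With the simulated hooks the first attempt is discarded because the CPU changes between the two queries, and the second attempt uses the conversion data for CPU 1. A thread that migrates away and back between the two queries would still slip through, and a real version would also have to read the conversion data consistently with respect to kernel updates; neither is handled here.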
For cheaply getting the current CPUID, if there's actual demand for this information in userland applications, it should be fairly easy to add to the scheduler, assuming it doesn't already exist. If JeffR, etc doesn't have time, let me know and I'll crank out a patch. --- Harrison Grundy From danny at cs.huji.ac.il Sat Mar 28 12:04:22 2009 From: danny at cs.huji.ac.il (Danny Braniss) Date: Sat Mar 28 12:04:40 2009 Subject: amr driver broken since March 12 In-Reply-To: <49CCF95F.1050307@samsco.org> References: <49CCDA41.4060101@samsco.org> <49CCF95F.1050307@samsco.org> Message-ID: > Danny Braniss wrote: > >> Danny Braniss wrote: > >>> at least for me :-) > >>> [and sorry for the cross posting] > >>> > >>> old (March 12 , i know need the svn rev number but...) > >> None of the commit activity on March 12 is jumping out at me as being > >> suspicious. However, you are now the second person who has told me > >> about AMR problems in 7.1 recently. If you have a precise svn change > >> number, it would help greatly. > >> > >> Scott > > my bad. the last working amr/iir is from March 12. > > I first detected the problem sometime later, but not later than March 23. > > So it has to be changes in that time frame. > > > > both drivers are showing similar symptoms: > > waiting for not busy > > the iir goes on for ever, and it's the cam that eventually panics, > > run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config > > (actually not 100% true, depending if WITNESS is on or off, it sometimes > > just hangs). > > the amr seems to time out: > > amr0: adapter is busy > > > > thanks for looking into the problem, > > > > danny > > > > > > Ok, here are a series of revisions to step through, in forward order. > Make sure that you are starting with at least revision 189568. 
Then, update to exactly the revision numbers below, recompile the kernel, and test:
>
> 190087
> 190091
>

it seems March 12 was a bit off :-)
it took some time, but I managed to close the gap:
189100 ok
189150 fails
I will continue tomorrow, but this should be helpful.

cheers,
danny

From davidxu at freebsd.org Sat Mar 28 17:35:54 2009
From: davidxu at freebsd.org (David Xu)
Date: Sat Mar 28 17:36:07 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
In-Reply-To: <49CD30E9.7030501@elischer.org>
References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org>
Message-ID: <49CEC261.4010803@freebsd.org>

Julian Elischer wrote:
> Scott Long wrote:
>> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.
>
> interestingly it is even feasible to have a per-thread page..
> it requires that the scheduler change a page table entry though.

I will knock on his door at midnight if he adds such a heavyweight task to the scheduler. TLB shootdown is horrible, and big code size squeezing data out of the CPU cache is not an ideal model. The scheduler should be as simple as just a context switching routine.
:-)

David Xu

From ssouhlal at FreeBSD.org Sat Mar 28 19:17:34 2009
From: ssouhlal at FreeBSD.org (Suleiman Souhlal)
Date: Sat Mar 28 19:17:47 2009
Subject: Improving the kernel/i386 timecounter performance (GSoC proposal)
In-Reply-To: <49CD1B3D.3030103@samsco.org>
References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD1B3D.3030103@samsco.org>
Message-ID: <04EDFED9-24B4-404C-96F7-2C96FBC300B4@FreeBSD.org>

On Mar 27, 2009, at 11:30 AM, Scott Long wrote:
> Robert Watson wrote:
>> On Fri, 27 Mar 2009, Scott Long wrote:
>>> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.
>> FWIW, there are some variations in schemes across OS's -- one extreme is the Linux approach, which actually exports a mini shared library in ELF format on the shared page, providing implementations of various services (such as entering system calls), time stuff, etc. Less extreme are the shared pages offered on Mac OS X, etc.
>
> Yes, but I'd like to start somewhere, and considering that it's been impossible in _5_ years to get the 30 minutes of Peter or JeffR or JHB time to get the basic VM magic done, I'm keeping my expectations as modest as possible.

You can find a proof-of-concept implementation for amd64 of a global page mapped in every process at http://people.freebsd.org/~ssouhlal/testing/syspage-20090328.diff .

It exports ticks to userland at VM_MIN_KERNEL_ADDRESS (0xfffffffe40000000).

In order for this to work on architectures without a direct map, the page will need to be mapped a second time as read/write (you might want to have a vm_offset_t pmap_map_syspage(vm_page_t m) function that does the right thing for each architecture).
Unfortunately, this trick probably won't work for per-process pages without more work, because we wouldn't be able to just insert the page in kernel_map. -- Suleiman From scottl at samsco.org Sat Mar 28 16:45:03 2009 From: scottl at samsco.org (Scott Long) Date: Sat Mar 28 21:40:17 2009 Subject: amr driver broken since March 12 In-Reply-To: References: <49CCDA41.4060101@samsco.org> <49CCF95F.1050307@samsco.org> Message-ID: <49CEB652.8060003@samsco.org> Danny Braniss wrote: > it seems March 12 was a bit off :-) > it took some time, but I managed to close the gap: > 189100 ok > 189150 fails > I will continue tomorrow, but this should be helpful. > > 189150 is in the middle of a big string of related commits. Try updating to the following change numbers and retesting: 189088 189107 189161 If the last one does not work, try editing /sys/dev/amr/amr.c to change #define AMR_ENABLE_CAM 1 to #define AMR_ENABLE_CAM 0 Scott From julian at elischer.org Sat Mar 28 22:20:16 2009 From: julian at elischer.org (Julian Elischer) Date: Sat Mar 28 22:20:23 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49CEC261.4010803@freebsd.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> Message-ID: <49CF0523.8020905@elischer.org> David Xu wrote: > Julian Elischer wrote: >> Scott Long wrote: >>> I've been talking about this for years. All I need is help with the >>> VM magic to create the page on fork. I also want two pages, one global >>> for gettimeofday (and any other global data we can think of) and one >>> per-process for static data like getpid/getgid. >> >> interestingly it is even feasible to have a per-thread page.. >> it requires that the scheduler change a page table entry tough. 
>> > > I will knock his door at midnight if he added such a heavy weight > task in the scheduler, TLB shutdown is horrible, and big code size > squeezing out data from CPU cache is not idea model. > scheduler should be as simple as just a context switching routine. > :-) > > David Xu depends on the hardware. anyhow I was only saying it was possible, not necessarily good or even useful. From danny at cs.huji.ac.il Sat Mar 28 23:42:50 2009 From: danny at cs.huji.ac.il (Danny Braniss) Date: Sat Mar 28 23:43:03 2009 Subject: amr driver broken since March 12 In-Reply-To: <49CEB652.8060003@samsco.org> References: <49CCDA41.4060101@samsco.org> <49CCF95F.1050307@samsco.org> <49CEB652.8060003@samsco.org> Message-ID: > Danny Braniss wrote: > > it seems March 12 was a bit off :-) > > it took some time, but I managed to close the gap: > > 189100 ok > > 189150 fails > > I will continue tomorrow, but this should be helpful. > > > > > > 189150 is in the middle of a big string of related commits. Try > updating to the following change numbers and retesting: > > 189088 > 189107 > 189161 > > If the last one does not work, try editing /sys/dev/amr/amr.c to change > > #define AMR_ENABLE_CAM 1 > > to > > #define AMR_ENABLE_CAM 0 > > Scott 189161 works, also for the iir now what? danny From danny at cs.huji.ac.il Sun Mar 29 01:00:42 2009 From: danny at cs.huji.ac.il (Danny Braniss) Date: Sun Mar 29 01:01:00 2009 Subject: amr driver broken since March 12 In-Reply-To: <49CF1A94.6040703@samsco.org> References: <49CCDA41.4060101@samsco.org> <49CCF95F.1050307@samsco.org> <49CEB652.8060003@samsco.org> <49CF1A94.6040703@samsco.org> Message-ID: > Danny Braniss wrote: > >> Danny Braniss wrote: > >>> it seems March 12 was a bit off :-) > >>> it took some time, but I managed to close the gap: > >>> 189100 ok > >>> 189150 fails > >>> I will continue tomorrow, but this should be helpful. > >>> > >>> > >> 189150 is in the middle of a big string of related commits. 
Try > >> updating to the following change numbers and retesting: > >> > >> 189088 > >> 189107 > >> 189161 > >> > >> If the last one does not work, try editing /sys/dev/amr/amr.c to change > >> > >> #define AMR_ENABLE_CAM 1 > >> > >> to > >> > >> #define AMR_ENABLE_CAM 0 > >> > >> Scott > > > > 189161 works, also for the iir > > now what? > > > > Next set to try: > > 189219 broken > 189229 broken any point in going on? danny > 189253 > 189402 > 189531 > 189569 > 189591 > > Scott From scottl at samsco.org Sat Mar 28 23:52:40 2009 From: scottl at samsco.org (Scott Long) Date: Sun Mar 29 04:27:32 2009 Subject: amr driver broken since March 12 In-Reply-To: References: <49CCDA41.4060101@samsco.org> <49CCF95F.1050307@samsco.org> <49CEB652.8060003@samsco.org> Message-ID: <49CF1A94.6040703@samsco.org> Danny Braniss wrote: >> Danny Braniss wrote: >>> it seems March 12 was a bit off :-) >>> it took some time, but I managed to close the gap: >>> 189100 ok >>> 189150 fails >>> I will continue tomorrow, but this should be helpful. >>> >>> >> 189150 is in the middle of a big string of related commits. Try >> updating to the following change numbers and retesting: >> >> 189088 >> 189107 >> 189161 >> >> If the last one does not work, try editing /sys/dev/amr/amr.c to change >> >> #define AMR_ENABLE_CAM 1 >> >> to >> >> #define AMR_ENABLE_CAM 0 >> >> Scott > > 189161 works, also for the iir > now what? 
> Next set to try: 189219 189229 189253 189402 189531 189569 189591 Scott From scottl at samsco.org Sun Mar 29 07:23:19 2009 From: scottl at samsco.org (Scott Long) Date: Sun Mar 29 07:41:24 2009 Subject: amr driver broken since March 12 In-Reply-To: References: <49CCDA41.4060101@samsco.org> <49CCF95F.1050307@samsco.org> <49CEB652.8060003@samsco.org> <49CF1A94.6040703@samsco.org> Message-ID: <49CF8432.5090201@samsco.org> Danny Braniss wrote: >> Danny Braniss wrote: >>>> Danny Braniss wrote: >>>>> it seems March 12 was a bit off :-) >>>>> it took some time, but I managed to close the gap: >>>>> 189100 ok >>>>> 189150 fails >>>>> I will continue tomorrow, but this should be helpful. >>>>> >>>>> >>>> 189150 is in the middle of a big string of related commits. Try >>>> updating to the following change numbers and retesting: >>>> >>>> 189088 >>>> 189107 >>>> 189161 >>>> >>>> If the last one does not work, try editing /sys/dev/amr/amr.c to change >>>> >>>> #define AMR_ENABLE_CAM 1 >>>> >>>> to >>>> >>>> #define AMR_ENABLE_CAM 0 >>>> >>>> Scott >>> 189161 works, also for the iir >>> now what? >>> >> Next set to try: >> >> 189219 > broken >> 189229 > broken Ok, so 189161 works, 189219 doesn't, correct? If so, did you also make the change to amr.c yet? 
Scott From cokane at FreeBSD.org Sun Mar 29 10:53:02 2009 From: cokane at FreeBSD.org (Coleman Kane) Date: Sun Mar 29 10:53:14 2009 Subject: REQUEST FOR TESTERS: `devel/mingw32-gcc' In-Reply-To: <49CED0E0.90709@math.missouri.edu> References: <1761162510.20070729004710@serebryakov.spb.ru> <46ACDB4B.6090707@FreeBSD.org> <46CB159F.4030404@FreeBSD.org> <49CED0E0.90709@math.missouri.edu> Message-ID: <1238348186.2663.6.camel@localhost> On Sat, 2009-03-28 at 20:37 -0500, Stephen Montgomery-Smith wrote: > Coleman Kane wrote: > > > I haven't seen any activity on the above email, and I am curious if: > > 1) It was missed (and this really does affect people) > > 2) Nobody cross-compiles using the mingw32-* ports (it is really very > > handy!) > > 3) Nobody really cares that mingw32-gcc will move from 3.4.5 --> 4.2.0 > > > > Please, if this affects you test out the above port tarball! Otherwise, > > this will end up going in and not take into account any problems that > > might arise in your environment. > > > > -- > > Coleman Kane > > > I just saw that this message is about two years old, and that the commit > must have been made years ago. > > Sorry for the noise. Thanks. It was handled off-line and we did the upgrade. -- Coleman Kane -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090329/93be28ec/attachment.pgp From peterjeremy at optushome.com.au Sun Mar 29 11:07:50 2009 From: peterjeremy at optushome.com.au (Peter Jeremy) Date: Sun Mar 29 11:08:07 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <3c0b01820903271119l4161c7b8yf74613b184add487@mail.gmail.com> References: <49CD0405.1060704@samsco.org> <5739.1238175087@critter.freebsd.dk> <3c0b01820903271119l4161c7b8yf74613b184add487@mail.gmail.com> Message-ID: <20090329180745.GB38985@server.vk2pj.dyndns.org> On 2009-Mar-27 14:19:16 -0400, Alexander Sack wrote: >On Fri, Mar 27, 2009 at 1:31 PM, Poul-Henning Kamp wrote: >> In message <49CD0405.1060704@samsco.org>, Scott Long writes: >> >>>I've been talking about this for years. All I need is help with the VM >>>magic to create the page on fork. I also want two pages, one global >>>for gettimeofday (and any other global data we can think of) and one >>>per-process for static data like getpid/getgid. gettimeofday is likely to be a mixture of global and per-core data so possibly a 3rd page containing per-core data is warranted. >I'm assuming folks are still in love with the TSC because it still the >cheapest as oppose ACPI-fast or HPET to even contemplate this? That is its major advantage. It might be feasible to export all the data necessary to implement the complete CLOCK_*_FAST family. >Also I thought at least PHK's comment (Sergey mentioned it) was true >regardless of bus, that the TSC is not consistent across multiple >packages (and for that matter I suppose cores) due to I *think* its >ISA lineage so how does this work again? TSC is nothing to do with ISA.
The easiest way to build a counter that runs at CPU clock rate is to put it very close to the CPU/core and have different counters for each CPU/core, without any synchronisation between the different counters. > Won't the rate in which you >tick up be sporadic over the course of the process scheduled on >different cores? (i.e. depending on what core RDTSC happened to land >on) RDTSC will wind up on the same core that your thread of execution is running on and this is defined by the scheduler. IE, it's up to the scheduler to ensure that the correct page of global (or per-cpu) data is mapped. -- Peter Jeremy -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 196 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090329/4f7b7176/attachment.pgp From peterjeremy at optushome.com.au Sun Mar 29 11:22:26 2009 From: peterjeremy at optushome.com.au (Peter Jeremy) Date: Sun Mar 29 11:22:38 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49CEC261.4010803@freebsd.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> Message-ID: <20090329182219.GC38985@server.vk2pj.dyndns.org> On 2009-Mar-29 08:35:45 +0800, David Xu wrote: >Julian Elischer wrote: >> interestingly it is even feasible to have a per-thread page.. >> it requires that the scheduler change a page table entry tough. > >I will knock his door at midnight if he added such a heavy weight >task in the scheduler, TLB shutdown is horrible, and big code size >squeezing out data from CPU cache is not idea model. >scheduler should be as simple as just a context switching routine. 
If the TSC is not consistent between all cores (which is probably the most common situation at present), then using the TSC implies knowing which core you are executing on. From a userland perspective, the easiest way to do this is to have a page of data that varies depending on which core you are executing on. -- Peter Jeremy -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 196 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20090329/f0e32b82/attachment.pgp From ken at mthelicon.com Sun Mar 29 11:26:08 2009 From: ken at mthelicon.com (Pegasus Mc Cleaft) Date: Sun Mar 29 11:26:15 2009 Subject: ping6 and traceroute6 trouble Message-ID: <200903291826.05152.ken@mthelicon.com> Hi Current and Hackers, I have only seen this recently and was wondering if anyone else can confirm this as a bug or perhaps a setup problem on my end. I think it started to appear with 8-Current from about the 25th of march. Any time I do a ping6 or traceroute6 I receive an error stating "Invalid value for hints." However, I am able to do all other functions (telnet, ssh, etc.) 
feathers# ping6 ipv6.google.com ping6: Invalid value for hints feathers# traceroute6 ipv6.google.com traceroute6: Invalid value for hints my If config looks like: re0: flags=8843 metric 0 mtu 1500 options=389b ether 00:1d:7d:07:24:1a inet 78.33.110.3 netmask 0xffffffe0 broadcast 78.33.110.31 inet6 fe80::21d:7dff:fe07:241a%re0 prefixlen 64 scopeid 0x1 inet6 2001:4d48:ad51:32:21d:7dff:fe07:241a prefixlen 64 autoconf media: Ethernet autoselect (100baseTX ) status: active Thanks in advance, Peg From ume at freebsd.org Sun Mar 29 11:48:32 2009 From: ume at freebsd.org (Hajimu UMEMOTO) Date: Sun Mar 29 11:48:39 2009 Subject: ping6 and traceroute6 trouble In-Reply-To: <200903291826.05152.ken@mthelicon.com> References: <200903291826.05152.ken@mthelicon.com> Message-ID: Hi, >>>>> On Sun, 29 Mar 2009 18:26:04 +0000 >>>>> Pegasus Mc Cleaft said: ken> I have only seen this recently and was wondering if anyone else can confirm ken> this as a bug or perhaps a setup problem on my end. I think it started to ken> appear with 8-Current from about the 25th of march. ken> Any time I do a ping6 or traceroute6 I receive an error stating "Invalid ken> value for hints." However, I am able to do all other functions (telnet, ssh, ken> etc.) ken> feathers# ping6 ipv6.google.com ken> ping6: Invalid value for hints ken> feathers# traceroute6 ipv6.google.com ken> traceroute6: Invalid value for hints I've committed the change to lib/libc/net/getaddrinfo.c little while ago that also fixed the problem. Please re-cvsup and try it. 
http://svn.freebsd.org/viewvc/base/head/lib/libc/net/getaddrinfo.c?r1=190416&r2=190525&view=patch Sincerely, -- Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan ume@mahoroba.org ume@{,jp.}FreeBSD.org http://www.imasy.org/~ume/ From pisymbol at gmail.com Sun Mar 29 13:06:06 2009 From: pisymbol at gmail.com (Alexander Sack) Date: Sun Mar 29 13:06:19 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <20090329180745.GB38985@server.vk2pj.dyndns.org> References: <49CD0405.1060704@samsco.org> <5739.1238175087@critter.freebsd.dk> <3c0b01820903271119l4161c7b8yf74613b184add487@mail.gmail.com> <20090329180745.GB38985@server.vk2pj.dyndns.org> Message-ID: <3c0b01820903291306i49cf284etd10f693392b7b9a1@mail.gmail.com> On Sun, Mar 29, 2009 at 2:07 PM, Peter Jeremy wrote: > On 2009-Mar-27 14:19:16 -0400, Alexander Sack wrote: >>I'm assuming folks are still in love with the TSC because it still the >>cheapest as oppose ACPI-fast or HPET to even contemplate this? > > That is its major advantage. It might be feasible to export all the > data necessary to implement the complete CLOCK_*_FAST family. Understood. >>Also I thought at least PHK's comment (Sergey mentioned it) was true >>regardless of bus, that the TSC is not consistent across multiple >>packages (and for that matter I suppose cores) due to I *think* its >>ISA lineage so how does this work again? > > TSC is nothing to do with ISA. The easiest way to build a counter > that runs at CPU clock rate is to put it very close to the CPU/core > and have different counters for each CPU/core, without any > synchronisation between the different counters. Understood thanks. I don't know why ISA and TSC are in my head. Please excuse. >> Won't the rate in which you >>tick up be sporadic over the course of the process scheduled on >>different cores? (i.e.
depending on what core RDTSC happened to land >>on) > > RDTSC will wind up on the same core that your thread of execution is > running on and this is defined by the scheduler. IE, it's up to the > scheduler to ensure that the correct page of global (or per-cpu) data > is mapped. OK. But then why not do what I *think* Solaris does in the first place: sync the cores using a master/slave scheme to effectively create an invariant TSC? I.e., if you are going to buy the overhead in the scheduler, why not do the dirty work at the source, instead of all this overhead in either the scheduler or the logic to know that this thread of execution was on that core and is using this TSC, etc.? I believe this topic has been rehashed before, but I don't remember the outcome, so again excuse me... :D Thanks! -aps From chuckr at telenix.org Sun Mar 29 14:40:18 2009 From: chuckr at telenix.org (Chuck Robey) Date: Sun Mar 29 14:40:24 2009 Subject: the web site Message-ID: <49CFEACE.5010808@telenix.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I just had to see if I could locate if there was a gnome project page by looking at the FreeBSD web pages. Why don't you try that yourself? I'll tell you, it's really FAR from being obvious. I'm just saying, even if folks don't want to change the web page, then a TOC-like section should be added near the bottom, to make navigation easier. I've tried my own hand at the web page design. I think I only proved to myself that I'm no artist, but it wouldn't make things TOO ugly just to add a section at the bottom, would it?
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAknP6s4ACgkQz62J6PPcoOkwzgCfXJtkN/PRNhRqvApSJCzjS6uj vW0An1yPG0wzG3d3i3njO5H3gJ2p8w9I =aBYh -----END PGP SIGNATURE----- From ken at mthelicon.com Sun Mar 29 14:56:31 2009 From: ken at mthelicon.com (Pegasus Mc Cleaft) Date: Sun Mar 29 14:56:38 2009 Subject: ping6 and traceroute6 trouble In-Reply-To: References: <200903291826.05152.ken@mthelicon.com> Message-ID: <200903292156.24202.ken@mthelicon.com> Hi Hajimu, > ken> Any time I do a ping6 or traceroute6 I receive an error stating > "Invalid ken> value for hints." However, I am able to do all other > functions (telnet, ssh, ken> etc.) > > ken> feathers# ping6 ipv6.google.com > ken> ping6: Invalid value for hints > > ken> feathers# traceroute6 ipv6.google.com > ken> traceroute6: Invalid value for hints > > I've committed the change to lib/libc/net/getaddrinfo.c little while > ago that also fixed the problem. Please re-cvsup and try it. > > http://svn.freebsd.org/viewvc/base/head/lib/libc/net/getaddrinfo.c?r1=19041 >6&r2=190525&view=patch Yes, that fixed things nicely. Thank you. Best regards, Peg From aram.h at mgk.ro Sun Mar 29 15:52:30 2009 From: aram.h at mgk.ro (Aram Havarneanu) Date: Sun Mar 29 15:52:36 2009 Subject: Shared Disk/Transactional/Distributed file system (GSoC Proposal) Message-ID: I have been giving some thought lately on some ideas I would like to do for Google Summer of Code. I haven't posted my application yet, as I hope to get some feedback first. I want to make an OpenVMS inspired file system. The key elements would be record oriented I/O, transaction processing and asynchronous I/O. Ideally, the file system will have redundancy features (for high availability) implemented through clustering. The file system should be a shared disk file system, usable in a SAN environment with multiple clients that use the exported block devices simultaneously. 
The first design issue is whether spreading the file system over a number of machines on the network is a feature that's relevant today or not. OpenVMS did that and it provided redundancy (you could mirror data between nodes) and performance (multiple machines could be serving you data at the same time). These days people tend to centralize storage in a SAN. The SAN provides its own redundancy, and so far performance is not an issue, as SANs seem to handle scalability extremely well. Even though the idea of spreading the file system through the network seems to have some potential theoretical performance advantages, the current network throughputs are a bottleneck for taking advantage of it. A current hard drive is faster than Gigabit Ethernet, and 10 Gigabit, which still has a prohibitive price today, is easily saturated in a small cluster of only a few machines. Other network technologies are, again, prohibitively priced. In the past, you could not put too much storage on one machine, so the ability to spread storage across multiple machines was important, but today storage is almost free and usually you can scale it enough in a SAN. Another question is whether to make it a pure record oriented I/O file system, or also implement traditional I/O. A pure record oriented I/O file system would make the distributed lock manager's job much simpler, as there is a simpler mapping between raw bits from the block device and the resources (files/records/fields) the DLM manages. Of course, the VFS interface would be just for convenience for such a system, as the abstraction it provides would add no value to such a file system. But for such a file system to be really useful, it needs to be used through the transactional, record oriented I/O API anyway. The other option would be to make it a mixed file system, like Files-11 in OpenVMS, with traditional I/O and record oriented I/O.
In that case, I would probably use the UFS on-disk structure, so it would be more like an addition to UFS than a new file system. But then of course, FreeBSD has ZFS which has a really nice layered architecture, so I could just use the ZFS lower layers that deal with block devices and can be used for things like redundancy and implement a file system on this architecture. But FreeBSD also has GEOM, and with some clever programming I can also use that. There are many ways of doing things. One other thing I would like to address is asynchronous I/O. In a way it's just fancy buffered I/O, but from the perspective of the programmer that uses the API, it is much more than that. There are a lot of cases where you want to make lots of unrelated commits to the resource pool, and these I/O operations rarely fail. Or there is the case where you make a big commit that takes time and you want to make smaller commits at the same time that need to finish before the big one, and you don't want the big commit to block the smaller ones. Of course you can solve these issues in multiple ways, but ASYNC I/O makes the job much easier for the programmer. You just make transactions, install handlers for Asynchronous System Traps that manage aborted transactions, finished transactions, etc. The AST mechanism works in a way like UNIX signals, but the ASTs don't stop system calls and can be queued with some mechanism. There is also the question about how to solve the issue of cache coherency between different nodes. With ASYNC I/O, caching write operations is not that important, but I think that caching read operations is important. This must be implemented in the distributed lock manager. Simply put, what is not locked by anybody should be current in the local cache and can be accessed from there after you make a request to the DLM and the DLM grants the read lock.
The distributed lock manager maintains a directory of requested resources, each held in concurrent read, concurrent write, protected read, protected write or exclusive mode. When transactions are made, it is the responsibility of the DLM to invalidate caches. This implementation is pretty expensive in terms of time spent in the round trip to the DLM, but I think the time necessary to make a request can be less than 0.75ms for LANs, and disk access time is on the order of 5ms for fast disks, so that would not be an issue. From a high level programmer perspective things work like this: 1) You make a request for a lock to the DLM. Requests are queued. Requests can be for concurrent read (desire to read, doesn't stop others from updating), concurrent write (non blocking read-write), protected read (locks the resource globally in a read only mode, preventing others from modifying it), protected write (locks the resource globally so that only you can update it) and exclusive mode, where only you can hold a lock. Locks can be on full files, records or even fields, allowing for flexibility and granularity. 2) Eventually the DLM grants you a lock and you can do transactions with the resource. Transactions are asynchronous by default, but can be made synchronous if needed. You can install handlers for ASTs to do various tasks when some events occur. 3) You release the lock. There is a lot of stuff that can be done and it can be done in various ways, so basically that is why I posted this on the list -- for discussion and suggestions. My ideas may seem vague at the moment, because with these simple ideas you can implement a lot of different things. Hopefully with your help we will be able to come up with something that is interesting, usable and feasible to be done in such a short time, at least to some prototype level. In any case, if I do this, I don't plan to stop working on it after GSoC. I will work on it as long as it is necessary.
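As an illustration only (this is an editorial sketch, not part of the proposal): the five lock modes listed in step 1 could be checked against each other with a small compatibility table. The sketch assumes the usual OpenVMS-style compatibility rules and leaves out the VMS null mode, since it is not mentioned above. The names dlm_mode, dlm_compatible and dlm_can_grant are hypothetical and do not correspond to any existing FreeBSD API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock modes, named after the five modes in the proposal. */
enum dlm_mode {
    DLM_CR,     /* concurrent read */
    DLM_CW,     /* concurrent write */
    DLM_PR,     /* protected read */
    DLM_PW,     /* protected write */
    DLM_EX,     /* exclusive */
    DLM_NMODES
};

/*
 * compat[held][requested] is true when a new lock in mode "requested"
 * can coexist with an existing lock in mode "held". The entries are
 * assumed to follow OpenVMS-style rules; the table is symmetric.
 */
static const bool compat[DLM_NMODES][DLM_NMODES] = {
    /*            CR     CW     PR     PW     EX   */
    /* CR */ { true,  true,  true,  true,  false },
    /* CW */ { true,  true,  false, false, false },
    /* PR */ { true,  false, true,  false, false },
    /* PW */ { true,  false, false, false, false },
    /* EX */ { false, false, false, false, false },
};

/* Check one held lock against one requested mode. */
static bool
dlm_compatible(enum dlm_mode held, enum dlm_mode requested)
{
    assert(held < DLM_NMODES && requested < DLM_NMODES);
    return compat[held][requested];
}

/* A request is grantable only if it is compatible with every held lock. */
static bool
dlm_can_grant(const enum dlm_mode *held, size_t nheld,
    enum dlm_mode requested)
{
    for (size_t i = 0; i < nheld; i++) {
        if (!dlm_compatible(held[i], requested))
            return false;
    }
    return true;
}
```

With this table, two protected-read holders can share a resource, while a protected-write request would have to stay queued until they release. A real DLM would also need the request queue, lock conversion and the AST delivery described above, none of which is shown here.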
Any feedback is greatly appreciated. I would also appreciate any hints toward general FreeBSD kernel programming. I read the developer docs on the website; I have (and mostly read) "The Design and Implementation of the FreeBSD Operating System" by Marshall Kirk McKusick and George V. Neville-Neil (also read the 4.4BSD version) and also read "Designing BSD Rootkits -- An Introduction to Kernel Hacking" by Joseph Kong. Thanks, -- Aram Hăvărneanu From kientzle at freebsd.org Sun Mar 29 16:58:16 2009 From: kientzle at freebsd.org (Tim Kientzle) Date: Sun Mar 29 16:58:23 2009 Subject: Shared Disk/Transactional/Distributed file system (GSoC Proposal) In-Reply-To: References: Message-ID: <49D00B16.20507@freebsd.org> Aram Havarneanu wrote: > I have been giving some thought lately on some ideas I would like to > do for Google Summer of Code. I haven't posted my application yet, as > I hope to get some feedback first. An interesting idea, but it sounds much too ambitious for a six-week summer project. I suggest you try to come up with something rather a bit smaller. There are a couple of proposals on the FreeBSD Summer of Code site for small projects that deal with filesystem issues. Especially if you've not worked with the FreeBSD kernel before, it's probably advisable to first tackle a small project that would fix some issues with existing file system implementations so you can learn what a real FS implementation looks like. Once you know your way around the kernel and the filesystem interfaces, then there will be plenty of time to tackle designing your own file system from scratch. There's a suggestion on the FreeBSD Summer of Code page, if I recall correctly, to fix some issues with the msdosfs file system implementation, and I seem to recall some people asking recently about zisofs support for the iso9660 driver. Of course, a lot depends on your particular background. You didn't say how much work of this sort you'd done in the past.
Cheers, Tim Kientzle From davidxu at freebsd.org Sun Mar 29 18:40:07 2009 From: davidxu at freebsd.org (David Xu) Date: Sun Mar 29 18:40:19 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49CF0523.8020905@elischer.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <49CF0523.8020905@elischer.org> Message-ID: <49D022EF.8030305@freebsd.org> Julian Elischer wrote: > depends on the hardware. > anyhow I was only saying it was possible, not necessarily > good or even useful. > > I had done some works for thread private page shared by kernel and userland when I was doing userland spinlock, if userland asks a page, kernel will allocate it and put some interesting thing in it by scheduler etcs, these code may be useful. From davidxu at freebsd.org Sun Mar 29 18:43:22 2009 From: davidxu at freebsd.org (David Xu) Date: Sun Mar 29 18:43:29 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49D022EF.8030305@freebsd.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <49CF0523.8020905@elischer.org> <49D022EF.8030305@freebsd.org> Message-ID: <49D023B7.8070402@freebsd.org> David Xu wrote: > Julian Elischer wrote: > >> depends on the hardware. >> anyhow I was only saying it was possible, not necessarily >> good or even useful. >> >> > > I had done some works for thread private page shared by kernel > and userland when I was doing userland spinlock, if userland asks > a page, kernel will allocate it and put some interesting thing in > it by scheduler etcs, these code may be useful. 
> FYI: http://people.freebsd.org/~davidxu/schedctl/ From rwmaillists at googlemail.com Sun Mar 29 21:34:47 2009 From: rwmaillists at googlemail.com (RW) Date: Sun Mar 29 21:34:54 2009 Subject: the web site In-Reply-To: <49CFEACE.5010808@telenix.org> References: <49CFEACE.5010808@telenix.org> Message-ID: <20090330051221.219c8a8c@gumby.homeunix.com> On Sun, 29 Mar 2009 17:40:30 -0400 Chuck Robey wrote: > I just had to see if I could locate if there was a gnome project page > by looking at the FreeBSD web pages. Why don't you try that > yourself? I'll tell you, it's really FAR from being obvious. I'm > just saying, even if folks don't want to change the web page, then a > TOC-like section should be added near the bottom, to make navigation > easier. If you click on "site map", at the bottom of the page, the gnome link is on the first line. From julian at elischer.org Sun Mar 29 21:49:20 2009 From: julian at elischer.org (Julian Elischer) Date: Sun Mar 29 21:49:33 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49D023B7.8070402@freebsd.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <49CF0523.8020905@elischer.org> <49D022EF.8030305@freebsd.org> <49D023B7.8070402@freebsd.org> Message-ID: <49D04F63.4010800@elischer.org> David Xu wrote: > David Xu wrote: >> Julian Elischer wrote: >> >>> depends on the hardware. >>> anyhow I was only saying it was possible, not necessarily >>> good or even useful. >>> >>> >> >> I had done some works for thread private page shared by kernel >> and userland when I was doing userland spinlock, if userland asks >> a page, kernel will allocate it and put some interesting thing in >> it by scheduler etcs, these code may be useful. 
>> > FYI: > http://people.freebsd.org/~davidxu/schedctl/ reading this quickly, you allocate a separately addressed page for each thread, but, how do you use it? From davidxu at freebsd.org Sun Mar 29 21:58:59 2009 From: davidxu at freebsd.org (David Xu) Date: Sun Mar 29 21:59:05 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49D04F63.4010800@elischer.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <49CF0523.8020905@elischer.org> <49D022EF.8030305@freebsd.org> <49D023B7.8070402@freebsd.org> <49D04F63.4010800@elischer.org> Message-ID: <49D0518D.4040000@freebsd.org> Julian Elischer wrote: > David Xu wrote: >> David Xu wrote: >>> Julian Elischer wrote: >>> >>>> depends on the hardware. >>>> anyhow I was only saying it was possible, not necessarily >>>> good or even useful. >>>> >>>> >>> >>> I had done some works for thread private page shared by kernel >>> and userland when I was doing userland spinlock, if userland asks >>> a page, kernel will allocate it and put some interesting thing in >>> it by scheduler etcs, these code may be useful. >>> >> FYI: >> http://people.freebsd.org/~davidxu/schedctl/ > > reading this quickly, you allocate a separately addressed page for > each thread, but, how do you use it? > > I store the address in userland TLS area, then get it when I want to check some scheduling informations. 
From julian at elischer.org Sun Mar 29 22:02:54 2009 From: julian at elischer.org (Julian Elischer) Date: Sun Mar 29 22:03:07 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49D0518D.4040000@freebsd.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <49CF0523.8020905@elischer.org> <49D022EF.8030305@freebsd.org> <49D023B7.8070402@freebsd.org> <49D04F63.4010800@elischer.org> <49D0518D.4040000@freebsd.org> Message-ID: <49D05292.30701@elischer.org> David Xu wrote: > Julian Elischer wrote: >> David Xu wrote: >>> David Xu wrote: >>>> Julian Elischer wrote: >>>> >>>>> depends on the hardware. >>>>> anyhow I was only saying it was possible, not necessarily >>>>> good or even useful. >>>>> >>>>> >>>> >>>> I had done some works for thread private page shared by kernel >>>> and userland when I was doing userland spinlock, if userland asks >>>> a page, kernel will allocate it and put some interesting thing in >>>> it by scheduler etcs, these code may be useful. >>>> >>> FYI: >>> http://people.freebsd.org/~davidxu/schedctl/ >> >> reading this quickly, you allocate a separately addressed page for >> each thread, but, how do you use it? >> >> > I store the address in userland TLS area, then get it when I want to > check some scheduling informations. and the scheduler writes out interesting information to that location?... 
From davidxu at freebsd.org Sun Mar 29 22:05:38 2009 From: davidxu at freebsd.org (David Xu) Date: Sun Mar 29 22:05:45 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49D05292.30701@elischer.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <49CF0523.8020905@elischer.org> <49D022EF.8030305@freebsd.org> <49D023B7.8070402@freebsd.org> <49D04F63.4010800@elischer.org> <49D0518D.4040000@freebsd.org> <49D05292.30701@elischer.org> Message-ID: <49D0531F.1000005@freebsd.org> Julian Elischer wrote: > David Xu wrote: >> Julian Elischer wrote: >>> David Xu wrote: >>>> David Xu wrote: >>>>> Julian Elischer wrote: >>>>> >>>>>> depends on the hardware. >>>>>> anyhow I was only saying it was possible, not necessarily >>>>>> good or even useful. >>>>>> >>>>>> >>>>> >>>>> I had done some works for thread private page shared by kernel >>>>> and userland when I was doing userland spinlock, if userland asks >>>>> a page, kernel will allocate it and put some interesting thing in >>>>> it by scheduler etcs, these code may be useful. >>>>> >>>> FYI: >>>> http://people.freebsd.org/~davidxu/schedctl/ >>> >>> reading this quickly, you allocate a separately addressed page for >>> each thread, but, how do you use it? >>> >>> >> I store the address in userland TLS area, then get it when I want to >> check some scheduling informations. > > and the scheduler writes out interesting information to that > location?... > > Yes. 
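The exchange above describes a per-thread page that the scheduler fills in and whose address userland keeps in its TLS area. A minimal sketch of the userland side could look like the following. The layout `struct sched_page`, its fields, and both function names are invented for illustration; they are assumptions and are not taken from the schedctl code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the per-thread page shared with the kernel. */
struct sched_page {
    volatile uint32_t sp_cpu;       /* assumed field: CPU the thread last ran on */
    volatile uint32_t sp_preempted; /* assumed field: nonzero once preempted */
};

/* One pointer per thread, kept in thread-local storage. */
static _Thread_local struct sched_page *my_sched_page;

/* Record the page address; a real implementation would get it from the kernel. */
void sched_page_attach(struct sched_page *page)
{
    my_sched_page = page;
}

/*
 * Returns 1 if the page says the thread was preempted, 0 if not,
 * and -1 if this thread has no page attached.
 */
int sched_page_preempted(void)
{
    struct sched_page *p = my_sched_page;

    if (p == NULL)
        return -1;
    return p->sp_preempted != 0;
}
```

A thread that never attached a page gets -1 back and would have to ask the kernel through an ordinary system call instead.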
From stephen at missouri.edu Sun Mar 29 22:52:56 2009 From: stephen at missouri.edu (Stephen Montgomery-Smith) Date: Sun Mar 29 22:53:03 2009 Subject: the web site In-Reply-To: <20090330051221.219c8a8c@gumby.homeunix.com> References: <49CFEACE.5010808@telenix.org> <20090330051221.219c8a8c@gumby.homeunix.com> Message-ID: <49D056F5.1030503@missouri.edu> RW wrote: > On Sun, 29 Mar 2009 17:40:30 -0400 > Chuck Robey wrote: > >> I just had to see if I could locate if there was a gnome project page >> by looking at the FreeBSD web pages. Why don't you try that >> yourself? I'll tell you, it's really FAR from being obvious. I'm >> just saying, even if folks don't want to change the web page, then a >> TOC-like section should be added near the bottom, to make navigation >> easier. > > If you click on "site map", at the bottom of the page, the gnome link > is on the first line. There is also the handy "search" feature which will tell you the following: Search Results The archive www contains the following items relevant to `gnome': Didn't get what you expected? Look here for searching hints. Return to the search page Nothing found. From phk at phk.freebsd.dk Mon Mar 30 00:32:47 2009 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Mon Mar 30 00:32:55 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: Your message of "Mon, 30 Mar 2009 05:07:45 +1100." <20090329180745.GB38985@server.vk2pj.dyndns.org> Message-ID: <9969.1238398362@critter.freebsd.dk> In message <20090329180745.GB38985@server.vk2pj.dyndns.org>, Peter Jeremy writes: >>I'm assuming folks are still in love with the TSC because it still the >>cheapest as oppose ACPI-fast or HPET to even contemplate this? > >That is its major advantage. It might be feasible to export all the >data necessary to implement the complete CLOCK_*_FAST family. The general attraction is that it can be read from userland by unprivileged programs.
On systems where the ACPI or HPET hardware can be memory-mapped, it should be equally possible to map those read-only into userland processes. Now _THAT_ would be interesting. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From avg at icyb.net.ua Mon Mar 30 03:52:32 2009 From: avg at icyb.net.ua (Andriy Gapon) Date: Mon Mar 30 03:52:40 2009 Subject: hot-attach SATA drive Message-ID: <49D0A46B.3000306@icyb.net.ua> Recently I tried to hot-attach a SATA drive to a running system. Controller is ICH9 in AHCI mode. Physically/electronically everything went smoothly, the drive spun-up. Then I tried to detach and re-attach all channels with no devices on them using atacontrol. I did it 3 times to be sure, but no new disk showed up. Then I finally rebooted, the disk showed up OK. Question: was hot-attach expected to work? Is there a limitation in hardware or in our driver? Note: I attached the drive to a regular SATA port, not eSATA. -- Andriy Gapon From scottl at samsco.org Sun Mar 29 22:03:34 2009 From: scottl at samsco.org (Scott Long) Date: Mon Mar 30 04:14:33 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49D0518D.4040000@freebsd.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <49CF0523.8020905@elischer.org> <49D022EF.8030305@freebsd.org> <49D023B7.8070402@freebsd.org> <49D04F63.4010800@elischer.org> <49D0518D.4040000@freebsd.org> Message-ID: <49D05294.6040200@samsco.org> David Xu wrote: > Julian Elischer wrote: >> David Xu wrote: >>> David Xu wrote: >>>> Julian Elischer wrote: >>>> >>>>> depends on the hardware. >>>>> anyhow I was only saying it was possible, not necessarily >>>>> good or even useful.
>>>>> >>>>> >>>> >>>> I had done some works for thread private page shared by kernel >>>> and userland when I was doing userland spinlock, if userland asks >>>> a page, kernel will allocate it and put some interesting thing in >>>> it by scheduler etcs, these code may be useful. >>>> >>> FYI: >>> http://people.freebsd.org/~davidxu/schedctl/ >> >> reading this quickly, you allocate a separately addressed page for >> each thread, but, how do you use it? >> >> > I store the address in userland TLS area, then get it when I want to > check some scheduling informations. > Interesting, I was wondering earlier today if pointing to the per-thread syspage in from the TLS area would save the TLB invalidate that you were concerned about. Scott From mav at FreeBSD.org Mon Mar 30 04:15:06 2009 From: mav at FreeBSD.org (Alexander Motin) Date: Mon Mar 30 04:15:15 2009 Subject: hot-attach SATA drive In-Reply-To: <49D0A46B.3000306@icyb.net.ua> References: <49D0A46B.3000306@icyb.net.ua> Message-ID: <49D0A99B.4030908@FreeBSD.org> Andriy Gapon wrote: > Recently I tried to hot-attach a SATA drive to a running system. > Controller is ICH9 in AHCI mode. Physically/electronically everything went > smoothly, the drive spun-up. Then I tried to detach and re-attach all channels > with no devices on them using atacontrol. I did it 3 times to be sure, but no new > disk showed up. Then I finally rebooted, the disk showed up OK. > > Question: was hot-attach expected to work? Is there a limitation in hardware or in > our driver? > > Note: I attached the drive to a regular SATA port, not eSATA. Which system version do you use? With recent CURRENT I have successfully tested insert/remove SATA drives with ICH8, ICH8M and JMB363 AHCI controllers channel attach/detach. Theoretically it is possible to insert/remove SATA drives even without channel attach/detach. Remove works fine, but such really hot insertion functionality is not implemented properly now and so blocked. 
-- Alexander Motin From avg at icyb.net.ua Mon Mar 30 04:57:49 2009 From: avg at icyb.net.ua (Andriy Gapon) Date: Mon Mar 30 04:57:56 2009 Subject: hot-attach SATA drive In-Reply-To: <49D0A99B.4030908@FreeBSD.org> References: <49D0A46B.3000306@icyb.net.ua> <49D0A99B.4030908@FreeBSD.org> Message-ID: <49D0B3B7.60400@icyb.net.ua> on 30/03/2009 14:14 Alexander Motin said the following: > Andriy Gapon wrote: >> Recently I tried to hot-attach a SATA drive to a running system. >> Controller is ICH9 in AHCI mode. Physically/electronically everything went >> smoothly, the drive spun-up. Then I tried to detach and re-attach all channels >> with no devices on them using atacontrol. I did it 3 times to be sure, but no new >> disk showed up. Then I finally rebooted, the disk showed up OK. >> >> Question: was hot-attach expected to work? Is there a limitation in hardware or in >> our driver? >> >> Note: I attached the drive to a regular SATA port, not eSATA. > > Which system version do you use? With recent CURRENT I have successfully > tested insert/remove SATA drives with ICH8, ICH8M and JMB363 AHCI > controllers channel attach/detach. Theoretically it is possible to > insert/remove SATA drives even without channel attach/detach. Remove > works fine, but such really hot insertion functionality is not > implemented properly now and so blocked. It was stable/7, amd64. Maybe there is a small subset of the changes in current that I could try in stable/7? 
-- Andriy Gapon From joerg at britannica.bec.de Mon Mar 30 05:28:27 2009 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Mon Mar 30 05:28:34 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <9969.1238398362@critter.freebsd.dk> References: <20090329180745.GB38985@server.vk2pj.dyndns.org> <9969.1238398362@critter.freebsd.dk> Message-ID: <20090330122821.GA1391@britannica.bec.de> On Mon, Mar 30, 2009 at 07:32:42AM +0000, Poul-Henning Kamp wrote: > On systems where the ACPI or HPET hardware can be memory-mapped, I should > be equally possible to map those read-only into userland processes. Both are IO memory and contain other data. There is also the question of how "undefined" is implemented for locked bus cycles to them... Joerg From aram.h at mgk.ro Mon Mar 30 05:46:12 2009 From: aram.h at mgk.ro (Aram Havarneanu) Date: Mon Mar 30 05:46:18 2009 Subject: Shared Disk/Transactional/Distributed file system (GSoC Proposal) In-Reply-To: <49D00B16.20507@freebsd.org> References: <49D00B16.20507@freebsd.org> Message-ID: 2009/3/30 Tim Kientzle : > Aram Havarneanu wrote: >> >> I have been giving some thought lately on some ideas I would like to >> do for Google Summer of Code. I haven't posted my application yet, as >> I hope to get some feedback first. > > An interesting idea, but it sounds much too > ambitious for a six-week summer project. > You are right of course, this could not be done in the required time. > There are a couple of proposals on the FreeBSD > Summer of Code site [...] Oh yes, I read them all, it's just that I had this idea for some time, and I wished to implement it one way or another on some system that I like. > Of course, a lot depends on your particular > background. You didn't say how much work of > this sort you'd done in the past. > Well, I didn't do file system work in the past, if that's what you are asking.
I started programming about 8-9 years ago, and I have mostly done low-level C stuff ever since. Started with DOS, quickly got rid of that problem, moved to Linux where I learned most of what I know about UNIX userland stuff, didn't like Linux for kernel work for reasons I won't detail here, moved to Windows, did various stuff both in the userland and in the kernel, started doing stuff in Windows for money (like 2 years ago), started doing stuff in Solaris, BSD and recently Mac OS X. I always found kernel programming much more challenging than userland stuff and I never really cared about the really high level programming stuff. Writing assembly code to make hardware work is something that I enjoy, writing GUI applications is something I hope I'll never do again. Actually I would prefer to do some driver work to improve hardware support in FreeBSD instead of all this file system thing, but I'd need hardware I don't have, so that's out of the question. I would like to do that EFI boot project, because that's something I would want for myself, but I only have 1 EFI machine and I can't afford to do development work on it because I need it for something else. Regarding the original problem, I propose a much simpler implementation using ZFS. Database people want transactional (asynchronous) record-oriented I/O. Well... probably depends on what database people you ask, because some would say synchronous I/O is imperative (that is why there is ZIL in the ZFS stack, isn't it?) and others will say the opposite thing. Anyway, you can always do SYNC I/O if you need it or want to, so I see the ASYNC I/O feature as a plus, because some people will benefit and the others won't be affected. So, as I was saying, using ZFS makes the problem much simpler because ZFS already has transactional capabilities (layers around DMU) and ZFS does all the storage management for free (in the pooled storage layer). All you need to do is to write something in the Interface Layer.
The current interfaces are ZPL, the POSIX layer hooked in the VFS, ZVOL and /dev/zfs. Only ZPL is used by the general public. What I propose is a new interface (akin to ZPL in some respects) but not hooked in the VFS (well, some degree of POSIX I/O maybe would help). This would export some syscalls for use by database consumers. I hope I could model the thing after OpenVMS APIs. What do you think of this idea? Thanks, -- Aram Hăvărneanu From aram.h at mgk.ro Mon Mar 30 05:52:00 2009 From: aram.h at mgk.ro (Aram Havarneanu) Date: Mon Mar 30 05:52:06 2009 Subject: Shared Disk/Transactional/Distributed file system (GSoC Proposal) In-Reply-To: References: <49D00B16.20507@freebsd.org> Message-ID: Oh, and of course, doing stuff with ZFS will mean that (Open)Solaris and Mac OS X will benefit as well, which I see as a good thing. From mav at FreeBSD.org Mon Mar 30 06:01:35 2009 From: mav at FreeBSD.org (Alexander Motin) Date: Mon Mar 30 06:01:42 2009 Subject: hot-attach SATA drive In-Reply-To: <49D0B3B7.60400@icyb.net.ua> References: <49D0A46B.3000306@icyb.net.ua> <49D0A99B.4030908@FreeBSD.org> <49D0B3B7.60400@icyb.net.ua> Message-ID: <49D0C2AD.70706@FreeBSD.org> Andriy Gapon wrote: > on 30/03/2009 14:14 Alexander Motin said the following: >> Andriy Gapon wrote: >>> Recently I tried to hot-attach a SATA drive to a running system. >>> Controller is ICH9 in AHCI mode. Physically/electronically everything went >>> smoothly, the drive spun-up. Then I tried to detach and re-attach all channels >>> with no devices on them using atacontrol. I did it 3 times to be sure, but no new >>> disk showed up. Then I finally rebooted, the disk showed up OK. >>> >>> Question: was hot-attach expected to work? Is there a limitation in hardware or in >>> our driver? >>> >>> Note: I attached the drive to a regular SATA port, not eSATA. >> Which system version do you use?
With recent CURRENT I have successfully >> tested insert/remove SATA drives with ICH8, ICH8M and JMB363 AHCI >> controllers channel attach/detach. Theoretically it is possible to >> insert/remove SATA drives even without channel attach/detach. Remove >> works fine, but such really hot insertion functionality is not >> implemented properly now and so blocked. > > It was stable/7, amd64. > Maybe there is a small subset of the changes in current that I could try in stable/7? There is significant sources difference due to modularization work done on CURRENT, so it is not so easy to directly compare sources or backport something. I haven't actually looked on/tested 7-STABLE much. -- Alexander Motin From kientzle at freebsd.org Mon Mar 30 12:29:09 2009 From: kientzle at freebsd.org (Tim Kientzle) Date: Mon Mar 30 12:29:15 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <9969.1238398362@critter.freebsd.dk> References: <9969.1238398362@critter.freebsd.dk> Message-ID: <49D11D7F.2020503@freebsd.org> Poul-Henning Kamp wrote: > In message <20090329180745.GB38985@server.vk2pj.dyndns.org>, Peter Jeremy write > s: > >>> I'm assuming folks are still in love with the TSC because it still the >>> cheapest as oppose ACPI-fast or HPET to even contemplate this? >> That is its major advantage. It might be feasible to export all the >> data necessary to implement the complete CLOCK_*_FAST family. > > The general attraction is that it can be read from userland by unpriviledged > programs. > > On systems where the ACPI or HPET hardware can be memory-mapped, I should > be equally possible to map those read-only into userland processes. > > Now _THAT_ would be interesting. Which brings us back to having a page of code provided by the kernel so that the kernel can determine the appropriate implementation (depending on the hardware availability) and so that userland can invoke the functions without going through a task switch. 
Libc can then either invoke these directly or, if the page is unavailable for any reason, use the system calls. Tim From prashant.vaibhav at gmail.com Mon Mar 30 14:32:37 2009 From: prashant.vaibhav at gmail.com (Prashant Vaibhav) Date: Mon Mar 30 14:32:51 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49D11D7F.2020503@freebsd.org> References: <9969.1238398362@critter.freebsd.dk> <49D11D7F.2020503@freebsd.org> Message-ID: <17560ccf0903301432t6a94dd86tb2f2a1a8d6edd7c2@mail.gmail.com> ...and that is _exactly_ what I propose(d) in the beginning and what OSX already does. Further, keeping the shared page and functions fixed at the end of the memory space has advantages like not needing any special linking, being easily accessible for code jumps or data reads, and so on [1]. The TSC issues are but one part of the puzzle. After this week-long discussion I still can't decide whether this was something that's desirable at all: keeping in mind that it's among the few project ideas tagged as "Suggested for Google Summer of Code 2009" on the FreeBSD website. :-\ Though I've been reading mailing list archives, and the various handbooks, I'm not familiar well enough with other parts of the freebsd kernel to draft another concrete proposal on my own at this time. [1] *Mac OS X Internals: A Systems Approach,* p 595, Amit Singh, ISBN 0321278542 On Tue, Mar 31, 2009 at 12:59 AM, Tim Kientzle wrote: > Poul-Henning Kamp wrote: > >> In message <20090329180745.GB38985@server.vk2pj.dyndns.org>, Peter Jeremy >> write >> s: >> >> I'm assuming folks are still in love with the TSC because it still the >>>> cheapest as oppose ACPI-fast or HPET to even contemplate this? >>>> >>> That is its major advantage. It might be feasible to export all the >>> data necessary to implement the complete CLOCK_*_FAST family. >>> >> >> The general attraction is that it can be read from userland by >> unpriviledged >> programs. 
>> >> On systems where the ACPI or HPET hardware can be memory-mapped, I should >> be equally possible to map those read-only into userland processes. >> >> Now _THAT_ would be interesting. >> > > Which brings us back to having a page of code > provided by the kernel so that the kernel can > determine the appropriate implementation > (depending on the hardware availability) and so > that userland can invoke the functions without > going through a task switch. Libc can then > either invoke these directly or, if the page is > unavailable for any reason, use the system calls. > > Tim > > From sobomax at FreeBSD.org Mon Mar 30 18:02:11 2009 From: sobomax at FreeBSD.org (Maxim Sobolev) Date: Mon Mar 30 18:02:24 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49CD0405.1060704@samsco.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> Message-ID: <49D162C4.3050006@FreeBSD.org> Scott Long wrote: > I've been talking about this for years. All I need is help with the VM > magic to create the page on fork. I also want two pages, one global > for gettimeofday (and any other global data we can think of) and one > per-process for static data like getpid/getgid. I believe somebody suggested that no real VM magic is needed and the libc should be in charge of opening special pseudo-device and doing necessary mmap(2) magic to get the page mapped in when user calls gettimeofday()/getpid()/getid() etc for the first time. 
-Maxim From sobomax at FreeBSD.org Mon Mar 30 18:31:14 2009 From: sobomax at FreeBSD.org (Maxim Sobolev) Date: Mon Mar 30 18:31:26 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <20090329182219.GC38985@server.vk2pj.dyndns.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <20090329182219.GC38985@server.vk2pj.dyndns.org> Message-ID: <49D1725A.1020005@FreeBSD.org> Peter Jeremy wrote: > On 2009-Mar-29 08:35:45 +0800, David Xu wrote: >> Julian Elischer wrote: >>> interestingly it is even feasible to have a per-thread page.. >>> it requires that the scheduler change a page table entry tough. >> I will knock his door at midnight if he added such a heavy weight >> task in the scheduler, TLB shutdown is horrible, and big code size >> squeezing out data from CPU cache is not idea model. >> scheduler should be as simple as just a context switching routine. > > If the TSC is not consistent between all cores (which is probably > the most common situation at present), then using the TSC implies > knowing which core you are executing on. From a userland perspective, > the easiest way to do this is to have a page of data that varies > depending on which core you are executing on. It's not that easy, unless you can pin thread to a specific core before reading that page. I.e. imagine the case when your thread reads per-cpu page, get preempted and scheduled to a different core, then executes RDTSC there, still thinking it got TSC reading from the first core. Even if it does re-read from that page again after reading TSC to determine if he has read the correct TSC, still it's possible (though not very likely) that it has been preempted again and scheduled to the first core after reading the TSC. 
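The read, check, and re-check sequence described in the message above can be sketched as follows. The two hooks standing in for "which core am I on" and RDTSC, as well as the demo functions, are hypothetical names introduced only for this illustration. The sketch shows just the retry loop; as the message points out, it cannot notice a thread that migrates away and comes back between the two core-id reads.

```c
#include <stdint.h>

/* Hypothetical hooks: current core id and a per-core counter such as the TSC. */
typedef unsigned (*cpu_id_fn)(void);
typedef uint64_t (*tsc_fn)(void);

/*
 * Read the core id, read the counter, then read the core id again and
 * retry when the two ids differ. Stores the core id the value belongs to.
 * A migration away and back between the two id reads goes unnoticed.
 */
uint64_t tsc_read_checked(cpu_id_fn cpu_id, tsc_fn tsc, unsigned *cpu_out)
{
    for (;;) {
        unsigned before = cpu_id();
        uint64_t value = tsc();
        unsigned after = cpu_id();

        if (before == after) {
            *cpu_out = before;
            return value;
        }
    }
}

/* Demo stand-ins: the first id read reports core 0, every later one core 1. */
static unsigned demo_calls;

unsigned demo_cpu_id(void)
{
    return demo_calls++ == 0 ? 0u : 1u;
}

uint64_t demo_tsc(void)
{
    return 1000u + demo_calls;
}
```

With the demo hooks the first pass sees the core id change and is discarded, and the second pass succeeds on core 1.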
-Maxim From sobomax at FreeBSD.org Mon Mar 30 18:45:38 2009 From: sobomax at FreeBSD.org (Maxim Sobolev) Date: Mon Mar 30 18:45:51 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: References: <33531707.21385.1238188446396.JavaMail.root@vms074.mailsrvcs.net> Message-ID: <49D175BA.6050307@FreeBSD.org> Robert Watson wrote: > Part of the point of mapping in the page at execve()-time, or > fork()-time for per-process pages (which I'm not entirely convinced we > need yet) is to avoid the cost of an extra device open, mmap, etc, for > every execve(), which can be quite expensive. I stuck a prototype page You don't really need to do it on every execve() unconditionally. It could be done on demand in libc, so that only when a thread passes a certain threshold does the "common page optimization code" kick in and do its open/mmap/etc magic. Otherwise, a "normal" syscall is performed. The implementation could be as simple as a counter in the appropriate libc routine, so that the optimization engages after a certain number of calls. For syscalls that return time it's also easy to do frequency thresholds, so that for example gettimeofday() only gets optimized if a thread calls it more frequently than 1 call/sec. -Maxim From davidxu at freebsd.org Mon Mar 30 19:38:04 2009 From: davidxu at freebsd.org (David Xu) Date: Mon Mar 30 19:38:17 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <17560ccf0903301432t6a94dd86tb2f2a1a8d6edd7c2@mail.gmail.com> References: <9969.1238398362@critter.freebsd.dk> <49D11D7F.2020503@freebsd.org> <17560ccf0903301432t6a94dd86tb2f2a1a8d6edd7c2@mail.gmail.com> Message-ID: <49D18209.1020805@freebsd.org> Prashant Vaibhav wrote: > ...and that is _exactly_ what I propose(d) in the beginning and what OSX > already does.
Further, keeping the shared page and functions fixed at the > end of the memory space has advantages like not needing any special linking, > being easily accessible for code jumps or data reads, and so on [1]. The TSC > issues are but one part of the puzzle. > After this week-long discussion I still can't decide whether this was > something that's desirable at all: keeping in mind that it's among the few > project ideas tagged as "Suggested for Google Summer of Code 2009" on the > FreeBSD website. :-\ Though I've been reading mailing list archives, and > the various handbooks, I'm not familiar well enough with other parts of the > freebsd kernel to draft another concrete proposal on my own at this time. > > [1] *Mac OS X Internals: A Systems Approach,* p 595, Amit Singh, ISBN > 0321278542 > > Not using ELF, but instead using signal-like trampoline code as we currently do, makes it very difficult for some languages to do asynchronous stack unwinding, e.g. pthread async cancellation and C++ object destruction. See my recent work on pthread cancellation and stack unwinding: http://people.freebsd.org/~davidxu/patch/unwind.patch Check x86_64_fallback_frame_state() to see what hacking code has to be written. Regards, David Xu From srinivasganji at gmail.com Tue Mar 31 02:04:42 2009 From: srinivasganji at gmail.com (Srinivas Ganji) Date: Tue Mar 31 02:04:48 2009 Subject: Is it possible to use the libthr.a file on a Redhat Linux? Message-ID: Dear All, I have tried to use the libthr.a library for compiling an application which is working fine on a Redhat system with the libpthread library. However, I end up with the following errors.
../lib/linux/libthr.a(thr_sem.o): In function `_sem_init': thr_sem.c:(.text+0x100): undefined reference to `ksem_init' thr_sem.c:(.text+0x115): undefined reference to `ksem_destroy' ../lib/linux/libthr.a(thr_sem.o): In function `_sem_destroy': thr_sem.c:(.text+0x216): undefined reference to `ksem_destroy' ../lib/linux/libthr.a(thr_sem.o): In function `_sem_timedwait': thr_sem.c:(.text+0x2ad): undefined reference to `ksem_timedwait' ../lib/linux/libthr.a(thr_sem.o): In function `_sem_wait': .... .... .... collect2: ld returned 1 exit status make: *** [target] Error 1 So, I have also mentioned the libc.so.7(This is also a FreeBSD libc library) library in our application to remove the above undefined references. So, at that time I got the following errors. /usr/bin/ld: errno@@FBSD_1.0: TLS definition in /lib/libc.so.6 section .tbss mismatches non-TLS definition in ../lib/linux/libc.so section .bss /lib/libc.so.6: could not read symbols: Bad value Here, the lib/libc.so.6 is a Redhat libc library where as ../lib/linux/libc.so is a FreeBSD library (libc.so.7). My question is: Is it possible to use the FreeBSD libthr.a library on a Redhat Linux distribution? Thanks in advance. With Regards, Srinivas G From wsw1wsw2 at gmail.com Tue Mar 31 02:39:05 2009 From: wsw1wsw2 at gmail.com (Shaowei Wang (wsw)) Date: Tue Mar 31 02:39:11 2009 Subject: Is it possible to use the libthr.a file on a Redhat Linux? In-Reply-To: References: Message-ID: <2e566b9e0903310239x15b53d1av2f45453cb35a8898@mail.gmail.com> On Tue, Mar 31, 2009 at 4:40 PM, Srinivas Ganji wrote: > Dear All, > > I have tried to use the libthr.a library for compiling an application which > is working fine on Redhat system with libpthread library. However, I end up > with the following errors. 
> > > > ../lib/linux/libthr.a(thr_sem.o): In function `_sem_init': > > thr_sem.c:(.text+0x100): undefined reference to `ksem_init' > > thr_sem.c:(.text+0x115): undefined reference to `ksem_destroy' > > ../lib/linux/libthr.a(thr_sem.o): In function `_sem_destroy': > > thr_sem.c:(.text+0x216): undefined reference to `ksem_destroy' > > ../lib/linux/libthr.a(thr_sem.o): In function `_sem_timedwait': > > thr_sem.c:(.text+0x2ad): undefined reference to `ksem_timedwait' > > ../lib/linux/libthr.a(thr_sem.o): In function `_sem_wait': > > .... > > .... > > .... > > > > collect2: ld returned 1 exit status > > make: *** [target] Error 1 > > > > So, I have also mentioned the libc.so.7(This is also a FreeBSD libc > > library) library in our application to remove the above undefined > references. So, at that time I got the following errors. > > > > /usr/bin/ld: errno@@FBSD_1.0: TLS definition in /lib/libc.so.6 section > .tbss > mismatches non-TLS definition in ../lib/linux/libc.so section .bss > > /lib/libc.so.6: could not read symbols: Bad value > > > > Here, the lib/libc.so.6 is a Redhat libc library where as > ../lib/linux/libc.so is a FreeBSD library (libc.so.7). > > > > My question is: Is it possible to use the FreeBSD libthr.a library on a > Redhat Linux distribution? > As I known, it's not possible unless you port the libthr to Linux system. Linux use clone() system call to implement thread library and FreeBSD use a different way(KSE). > > > > Thanks in advance. 
> > > > With Regards, > > Srinivas G > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From samflanker at gmail.com Tue Mar 31 07:36:54 2009 From: samflanker at gmail.com (Vladimir Ermakov) Date: Tue Mar 31 07:37:01 2009 Subject: [problem] aac0 does not respond In-Reply-To: <49C8AD9B.7000500@gmail.com> References: <49C8AD9B.7000500@gmail.com> Message-ID: <49D22A83.508@gmail.com> Vladimir Ermakov wrote: > Hello, All > > Describe my problem: > have volume RAID-10 (SAS-HDD x 6) on Adaptec RAID 5805 > 2 HHD of 6 have errors in smart data (damaged) > i am try read file /var/db/mysql/ibdata1 from this volume > system does not respond ( lost access to ssh ) after read 6GB data > from this file > and print debug messages on ttyv0 > > As to prevent the emergence of this problem? > As monitor the status of RAID-controller? 
> > please, any solutions > > /Vladimir Ermakov > > > > ==========================messages on > ttyv0================================== > Mar 22 20:20:12 df24 kernel: aac0: COMMAND 0xffffffff80859dd0 TIMEOUT > AFTER 50 SECONDS > Mar 22 20:20:12 df24 kernel: aac0: COMMAND 0xffffffff808599e0 TIMEOUT > AFTER 50 SECONDS > Mar 22 20:20:12 df24 kernel: aac0: COMMAND 0xffffffff808569c0 TIMEOUT > AFTER 50 SECONDS > Mar 22 20:20:32 df24 kernel: aac0: COMMAND 0xffffffff80859dd0 TIMEOUT > AFTER 70 SECONDS > Mar 22 20:20:32 df24 kernel: aac0: COMMAND 0xffffffff808599e0 TIMEOUT > AFTER 70 SECONDS > Mar 22 20:20:32 df24 kernel: aac0: COMMAND 0xffffffff808569c0 TIMEOUT > AFTER 70 SECONDS > Mar 22 20:20:52 df24 kernel: aac0: COMMAND 0xffffffff80859dd0 TIMEOUT > AFTER 90 SECONDS > Mar 22 20:20:52 df24 kernel: aac0: COMMAND 0xffffffff808599e0 TIMEOUT > AFTER 90 SECONDS > Mar 22 20:20:52 df24 kernel: aac0: COMMAND 0xffffffff808569c0 TIMEOUT > AFTER 90 SECONDS > Mar 22 20:21:12 df24 kernel: aac0: COMMAND 0xffffffff80859dd0 TIMEOUT > AFTER 111 SECONDS > Mar 22 20:21:12 df24 kernel: aac0: COMMAND 0xffffffff808599e0 TIMEOUT > AFTER 111 SECONDS > Mar 22 20:21:12 df24 kernel: aac0: COMMAND 0xffffffff808569c0 TIMEOUT > AFTER 111 SECONDS > =============================================================== > > > > > # ls -halt /var/db/mysql/ibdata1 > -rw-rw---- 1 88 88 256G Mar 22 23:23 /var/db/mysql/ibdata1 > > # tar -cf - /var/db/mysql/ibdata1 | pv -br > /dev/null > 3.73GB [ 146MB/s] > > > > # smartctl -a -d scsi /dev/pass4 > smartctl version 5.38 [amd64-portbld-freebsd7.1] Copyright (C) 2002-8 > Bruce Allen > Home page is http://smartmontools.sourceforge.net/ > > Device: FUJITSU MAX3147RC Version: 0104 > Serial number: xxxxxxxxxxxxxxxxx > Device type: <31> > Transport protocol: SAS > Local Time is: Tue Mar 24 10:07:08 2009 CET > Device supports SMART and is Enabled > Temperature Warning Enabled > SMART Health Status: OK > > Current Drive Temperature: 21 C > Drive Trip Temperature: 65 C > 
Manufactured in week 18 of year 2006 > Recommended maximum start stop count: 10000 times > Current start stop count: 46 times > > Error counter log:
>            Errors Corrected by           Total   Correction     Gigabytes    Total
>                ECC          rereads/    errors   algorithm      processed    uncorrected
>            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
> read:          0    75782      1488         0          0      31950.874        1488
> write:         0      567         0         0          0      12148.416           0
> verify:        0    17642       960         0          0      10148.962         960
> > > > # uname -a > FreeBSD sys3 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #1: Mon Nov 3 > 18:39:49 UTC 2008 root@sys3:/usr/obj/usr/src/sys/SYS3 amd64 > > # pciconf -lvc > *** > aac0@pci0:10:0:0: class=0x010400 card=0x02b69005 chip=0x02859005 > rev=0x09 hdr=0x00 > vendor = 'Adaptec Inc' > device = 'AAC-RAID RAID Controller' > class = mass storage > subclass = RAID > cap 01[98] = powerspec 2 supports D0 D1 D3 current D0 > cap 05[a0] = MSI supports 2 messages, 64 bit > cap 10[d0] = PCI-Express 1 endpoint > cap 03[90] = VPD > *** > > # dmesg | grep aac0 > aac0: mem 0xb8a00000-0xb8bfffff irq 16 at device > 0.0 on pci10 > aac0: Enabling 64-bit address support > aac0: Enable Raw I/O > aac0: Enable 64-bit array > aac0: New comm. interface enabled > aac0: [ITHREAD] > aac0: Adaptec 5805, aac driver 2.0.0-1 > aacp0: on aac0 > aacp1: on aac0 > aacp2: on aac0 > aacd0: on aac0 > > > I tried booting a FreeBSD 7.1 i386 system and reading the file from the volume: ---------------------------------------------------- # tar -cf - /var/db/mysql/ibdata1 | pv -br > /dev/null 256GB [ 208MB/s] # echo $? 
0 # ---------------------------------------------------- This works without problems (the controller does not freeze). Please help with FreeBSD 7.1 amd64. /Vladimir Ermakov From peterjeremy at optushome.com.au Tue Mar 31 12:02:28 2009 From: peterjeremy at optushome.com.au (Peter Jeremy) Date: Tue Mar 31 12:02:36 2009 Subject: Improving the kernel/i386 timecounter performance (GSoC proposal) In-Reply-To: <49D1725A.1020005@FreeBSD.org> References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net> <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org> <49CEC261.4010803@freebsd.org> <20090329182219.GC38985@server.vk2pj.dyndns.org> <49D1725A.1020005@FreeBSD.org> Message-ID: <20090331190222.GA2816@server.vk2pj.dyndns.org> On 2009-Mar-30 18:45:30 -0700, Maxim Sobolev wrote: >You don't really need to do it on every execve() unconditionally. It >could be done on demand in libc, so that only when a thread passes a certain >threshold, the "common page optimization code" kicks in and does its >open/mmap/etc magic. Otherwise, a "normal" syscall is performed. This "optimisation" is premature. The first step is to implement an approach that always maps (or whatever) the data and then gather some information about its overheads in the real world. If they are deemed excessive, only then do we start looking at how to improve things. And IMO, the first step would be to lazily map the page - so it's not mapped by default but mapped the first time any of the information in it is used. >that for example gettimeofday() only gets optimized if a thread calls it >more frequently than 1 call/sec. Whilst this thread started talking about timecounters, once you have a shared page, there is a variety of other information that could be exported - PID being the most obvious. If the page is exported as code rather than data (as has been suggested) then you also have the possibility of exporting CPU-dependent optimised versions of some library functions (à la Solaris). 
The more stuff you export, the less you gain from supporting an export threshold. On 2009-Mar-30 18:31:06 -0700, Maxim Sobolev wrote: >It's not that easy, unless you can pin the thread to a specific core before >reading that page. I.e. imagine the case when your thread reads the per-CPU >page, gets preempted and scheduled to a different core, then executes >RDTSC there, still thinking it got the TSC reading from the first core. Even >if it does re-read from that page again after reading the TSC to determine >if it has read the correct TSC, still it's possible (though not very >likely) that it has been preempted again and scheduled to the first core >after reading the TSC. Good point. If you export code, rather than data, then the scheduler can just special-case threads where the return address is inside the magic page (this is a fairly cheap test and only needs to occur once you have decided to re-schedule that thread - so you are already in the "expensive" part of the scheduler and a few more instructions won't be noticeable there). The most obvious approach would be to temporarily pin the thread whilst it's executing inside that page. -- Peter Jeremy From julian at elischer.org Tue Mar 31 14:46:55 2009 From: julian at elischer.org (Julian Elischer) Date: Tue Mar 31 14:47:03 2009 Subject: Is it possible to use the libthr.a file on a Redhat Linux? 
In-Reply-To: <2e566b9e0903310239x15b53d1av2f45453cb35a8898@mail.gmail.com> References: <2e566b9e0903310239x15b53d1av2f45453cb35a8898@mail.gmail.com> Message-ID: <49D28F67.7010802@elischer.org> Shaowei Wang (wsw) wrote: > On Tue, Mar 31, 2009 at 4:40 PM, Srinivas Ganji wrote: > >> Dear All, >> >> I have tried to use the libthr.a library for compiling an application which >> is working fine on a Redhat system with the libpthread library. However, I end up >> with the following errors. >> >> >> >> ../lib/linux/libthr.a(thr_sem.o): In function `_sem_init': >> >> thr_sem.c:(.text+0x100): undefined reference to `ksem_init' >> >> thr_sem.c:(.text+0x115): undefined reference to `ksem_destroy' >> >> ../lib/linux/libthr.a(thr_sem.o): In function `_sem_destroy': >> >> thr_sem.c:(.text+0x216): undefined reference to `ksem_destroy' >> >> ../lib/linux/libthr.a(thr_sem.o): In function `_sem_timedwait': >> >> thr_sem.c:(.text+0x2ad): undefined reference to `ksem_timedwait' >> >> ../lib/linux/libthr.a(thr_sem.o): In function `_sem_wait': >> >> .... >> >> .... >> >> .... >> >> >> >> collect2: ld returned 1 exit status >> >> make: *** [target] Error 1 >> >> >> >> So, I also added the libc.so.7 library (this is also a FreeBSD libc >> >> library) to our application to remove the above undefined >> references. At that point I got the following errors. >> >> >> >> /usr/bin/ld: errno@@FBSD_1.0: TLS definition in /lib/libc.so.6 section >> .tbss >> mismatches non-TLS definition in ../lib/linux/libc.so section .bss >> >> /lib/libc.so.6: could not read symbols: Bad value >> >> >> >> Here, /lib/libc.so.6 is a Redhat libc library whereas >> ../lib/linux/libc.so is a FreeBSD library (libc.so.7). >> >> >> >> My question is: Is it possible to use the FreeBSD libthr.a library on a >> Redhat Linux distribution? >> > > As far as I know, it's not possible unless you port libthr to Linux. > > Linux uses the clone() system call to implement its thread library and FreeBSD uses a > different way (KSE). 
No, KSE was an experimental system that was removed. FreeBSD threads are created using the thr_create() call. There is some similarity in the way that libthr and Linux make threads, as they are both 1:1 models. > > >> >> >> Thanks in advance. >> >> >> >> With Regards, >> >> Srinivas G From delphij at delphij.net Tue Mar 31 17:05:15 2009 From: delphij at delphij.net (Xin LI) Date: Tue Mar 31 17:05:30 2009 Subject: Is it possible to use the libthr.a file on a Redhat Linux? In-Reply-To: References: Message-ID: <49D2AFA9.7080707@delphij.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Srinivas Ganji wrote: > Dear All, > > I have tried to use the libthr.a library for compiling an application which > is working fine on Redhat system with libpthread library. However, I end up > with the following errors. > [...] > My question is: Is it possible to use the FreeBSD libthr.a library on a > Redhat Linux distribution? I don't think so. libthr depends on some features that only exist on FreeBSD. Like other system libraries, it wraps FreeBSD kernel interfaces into what is more familiar to application programmers, such as C and POSIX APIs. It should be noted that it could be possible if you recompile your application under RedHat Linux, as the upper layer of the API should be more similar. Cheers, - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) iEYEARECAAYFAknSr6kACgkQi+vbBBjt66AiLACePPXunI2ApOoJ3OSLZKfpZWg2 m1sAoLPrnqOavIV0ldM1+D334JMuaQCs =akOZ -----END PGP SIGNATURE----- From delphij at delphij.net Tue Mar 31 19:04:48 2009 From: delphij at delphij.net (Xin LI) Date: Tue Mar 31 19:27:30 2009 Subject: Intel Integrated Raid (iir) relevance In-Reply-To: References: Message-ID: <49D2CBB1.3070802@delphij.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 (It would probably be a good idea to redirect this discussion to -stable@; redirected.) Hi, Danny, Danny Braniss wrote: > It's no longer working (for me) under 7.2, and so far > I am not getting any feedback, so since it seems that > this particular hardware has reached EOL, I was wondering > if, > a) it's true, > b) drop it, and replace it. > c) should time be spent in getting it to work again. I'm not sure about your problem with iir(4). A diff against RELENG_7_1 does not reveal any change in the driver itself. Are you sure that the device works under 7.1-R? Cheers, - -- Xin LI http://www.delphij.net/ FreeBSD - The Power to Serve! -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) iEYEARECAAYFAknSy7AACgkQi+vbBBjt66AUoQCgtFiu6Bsg0LygJ7gAnKLdBBMN JKIAoKNioqTEQSA8vX621jqTpBKTaO1C =RmFa -----END PGP SIGNATURE-----