From if at xip.at Sat Mar 1 01:03:20 2008
From: if at xip.at (Ingo Flaschberger)
Date: Sat Mar 1 01:10:52 2008
Subject: FBSD 1GBit router?
In-Reply-To: <47C8964C.9080309@digiware.nl>
References: <20080226003107.54CD94500E@ptavv.es.net>
    <47C8964C.9080309@digiware.nl>
Message-ID:

>> I have a 1.2GHz Pentium-M appliance with 4x 32-bit, 33MHz PCI Intel e1000
>> cards.
>> With maximum tuning I can "route" ~400mbps with big packets and ~80mbps
>> with 64-byte packets.
>> That is around 100kpps, which is not bad for a PCI architecture.
>>
>> To reach higher bandwidths, better buses are needed.
>> PCI-Express cards are currently the best choice.
>> One dedicated PCI-Express lane (1.25gbps) has more bandwidth than a whole
>> 32-bit, 33MHz PCI bus.
>
> Like you say, routing 400 Mb/s is close to the max of the PCI bus, which
> has a theoretical max of 33*4*8 ~ 1Gbps. Now routing is 500Mb/s in, 500Mb/s
> out. So you are within 80% of the bus-max, not counting memory-access and
> others.

yes.

> PCI express will give you a bus per PCI-E device into a central hub, thus
> upping the limit to the speed of the FrontSideBus in Intel architectures.
> Which at the moment is a lot higher than what a single PCI bus does.

That's why my next router will be based on this box:
http://www.axiomtek.com/products/ViewProduct.asp?view=429

Hopefully there will be NICs connected directly to the memory bus in the
future (HyperTransport-connected NICs).

> What it does not explain is why you can only get 80Mb/s with 64-byte packets,
> which would suggest other bottlenecks than just the bus.

Perhaps something with interrupts:
http://books.google.at/books?id=pr4fspaQqZkC&pg=PA144&lpg=PA144&dq=pci+interrupt+delay&source=web&ots=zbvVU2CgVx&sig=APe9YjdtK35ccnow7BDI2hzie7s&hl=de#PPA144,M1

MSI (Message-Signalled Interrupts) is not very common on the PCI
architecture; PCI-E uses only MSI.

The kpps always stayed around 100, regardless of whether I used
fast-forwarding, fast-interrupts, or HZ values higher than 1000.
But 100kpps is great for router hardware costing about 600eur.

Kind regards,
Ingo Flaschberger

From adrian at freebsd.org Sat Mar 1 02:06:35 2008
From: adrian at freebsd.org (Adrian Chadd)
Date: Sat Mar 1 02:06:41 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To: <3aaaa3a0802290854t639559b6if0adc4009997e9db@mail.gmail.com>
References: <47C59591.6040600@errno.com>
    <3aaaa3a0802290744x25a81d68vf0ff101f6b7a819e@mail.gmail.com>
    <1204302128.2126.150.camel@localhost>
    <3aaaa3a0802290854t639559b6if0adc4009997e9db@mail.gmail.com>
Message-ID:

On 01/03/2008, Chris wrote:

> You're working round what I just said. A NIC should perform equally well
> as it does in other operating systems; just because it's cheaper is not
> an excuse for buggy performance. There are also other good network
> cards apart from the Intel Pro 1000. I am talking about stability, not
> performance. I expect an Intel Pro 1000 to outperform a Realtek; however,
> I expect both to be stable in terms of connectivity. I expect a
> Realtek in FreeBSD to perform as well as a Realtek in Windows and
> Linux. :)

Patches please!


Adrian

--
Adrian Chadd - adrian@freebsd.org

From chrcoluk at gmail.com Sat Mar 1 02:21:00 2008
From: chrcoluk at gmail.com (Chris)
Date: Sat Mar 1 02:21:04 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To:
References: <47C59591.6040600@errno.com>
    <3aaaa3a0802290744x25a81d68vf0ff101f6b7a819e@mail.gmail.com>
    <1204302128.2126.150.camel@localhost>
    <3aaaa3a0802290854t639559b6if0adc4009997e9db@mail.gmail.com>
Message-ID: <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>

On 01/03/2008, Adrian Chadd wrote:
> On 01/03/2008, Chris wrote:
>
> > You're working round what I just said. A NIC should perform equally well
> > as it does in other operating systems; just because it's cheaper is not
> > an excuse for buggy performance. There are also other good network
> > cards apart from the Intel Pro 1000.
> > I am talking about stability, not
> > performance. I expect an Intel Pro 1000 to outperform a Realtek; however,
> > I expect both to be stable in terms of connectivity. I expect a
> > Realtek in FreeBSD to perform as well as a Realtek in Windows and
> > Linux. :)
>
> Patches please!
>
>
> Adrian
>
>
> --
> Adrian Chadd - adrian@freebsd.org
>

Ironically, the latest server I got last night has an Intel Pro 1000, a rarity :)

I am just giving feedback, as when I speak to people in the datacentre
and hosting business the biggest gripe with FreeBSD is hardware
compatibility. As I adore FreeBSD I ignore this and work round it, but
it's definitely reducing take-up.

Of course I know current re issues are getting attention, which I am
thankful for. I fully understand the time and effort required to write
drivers, patches etc. and have no criticisms of the people who do
this; my complaint is more focused on people claiming there are no
issues, it's just the hardware.

Thanks

Chris

From wjw at digiware.nl Sat Mar 1 11:31:35 2008
From: wjw at digiware.nl (Willem Jan Withagen)
Date: Sat Mar 1 12:30:15 2008
Subject: FBSD 1GBit router?
In-Reply-To:
References: <20080226003107.54CD94500E@ptavv.es.net>
    <47C8964C.9080309@digiware.nl>
Message-ID: <47C93E8B.3010609@digiware.nl>

Ingo Flaschberger wrote:
>
>>> I have a 1.2GHz Pentium-M appliance with 4x 32-bit, 33MHz PCI
>>> Intel e1000 cards. With maximum tuning I can "route" ~400mbps
>>> with big packets and ~80mbps with 64-byte packets. That is around
>>> 100kpps, which is not bad for a PCI architecture.
>>>
>>> To reach higher bandwidths, better buses are needed. PCI-Express
>>> cards are currently the best choice. One dedicated PCI-Express
>>> lane (1.25gbps) has more bandwidth than a whole 32-bit, 33MHz
>>> PCI bus.
>>
>> Like you say, routing 400 Mb/s is close to the max of the PCI bus,
>> which has a theoretical max of 33*4*8 ~ 1Gbps. Now routing is
>> 500Mb/s in, 500Mb/s out. So you are within 80% of the bus-max, not
>> counting memory-access and others.
>
> yes.
>
>> PCI express will give you a bus per PCI-E device into a central
>> hub, thus upping the limit to the speed of the FrontSideBus in
>> Intel architectures. Which at the moment is a lot higher than what
>> a single PCI bus does.
>
> That's why my next router will be based on this box:
> http://www.axiomtek.com/products/ViewProduct.asp?view=429

Nice piece of hardware.
I don't like the single 2.5" disk option, though.

And I'm not sure what to think of:
    "Seven 10/100/1000Mbps (through PCI-E by one
     interface) ports (RJ-45)"
Which seems to suggest everything comes in through one PCI-E interface.
That had better have 8 or 16 lanes.

> Hopefully there will be NICs connected directly to the memory bus in the
> future (HyperTransport-connected NICs).

Well, that is going to be an AMD-only solution, and I'm not even sure
that AMD would like to have things other than CPUs on that bus.

>
>> What it does not explain is why you can only get 80Mb/s with 64-byte
>> packets, which would suggest other bottlenecks than just the bus.
>
> Perhaps something with interrupts:
> http://books.google.at/books?id=pr4fspaQqZkC&pg=PA144&lpg=PA144&dq=pci+interrupt+delay&source=web&ots=zbvVU2CgVx&sig=APe9YjdtK35ccnow7BDI2hzie7s&hl=de#PPA144,M1
>
> MSI (Message-Signalled Interrupts) is not very common on the PCI
> architecture; PCI-E uses only MSI.
>
> The kpps always stayed around 100, regardless of whether I used
> fast-forwarding, fast-interrupts, or HZ values higher than 1000.

MSI is not used for regular PCI buses. It could be that PCI-E does use it;
I believe you on that. But even then I'd like to know where the bottleneck
is in the 100kpps limit with 64-byte packets.

> But 100kpps is great for router hardware costing about 600eur.

I've seen routers 10 times as expensive that were not able to do that.
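
As a rough sanity check on the figures in this thread, here is a small Python sketch of the bus arithmetic. It assumes the 32-bit/33MHz bus from the "33*4*8" estimate and that every forwarded packet crosses the bus twice (NIC to memory, then memory to NIC). The function and constant names are my own, and the model ignores descriptor fetches, arbitration and per-transaction overhead, which matter a lot for small packets.

```python
# Back-of-the-envelope PCI bus load for packet forwarding.
# Assumptions (not measurements): 32-bit bus at 33 MHz, and each
# forwarded packet is transferred over the bus twice.

PCI_BUS_BITS = 32            # bus width in bits
PCI_CLOCK_HZ = 33_000_000    # 33 MHz, as in the "33*4*8" estimate


def pci_peak_bps():
    """Theoretical peak of a 32-bit/33MHz PCI bus in bits per second."""
    return PCI_BUS_BITS * PCI_CLOCK_HZ


def forwarding_bus_load_bps(pps, packet_bytes):
    """Bits per second on the bus when forwarding pps packets per second."""
    return 2 * pps * packet_bytes * 8


peak = pci_peak_bps()
small = forwarding_bus_load_bps(100_000, 64)

print(f"PCI peak: {peak / 1e6:.0f} Mbit/s")
print(f"64-byte packets at 100 kpps: {small / 1e6:.1f} Mbit/s on the bus "
      f"({100 * small / peak:.1f}% of peak)")
```

Under these assumptions, 100 kpps of 64-byte packets uses only about a tenth of the raw bus bandwidth, which fits the suspicion that the small-packet limit comes from per-packet costs rather than from bus throughput.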
--WjW

From brueffer at FreeBSD.org Sat Mar 1 12:45:12 2008
From: brueffer at FreeBSD.org (Christian Brueffer)
Date: Sat Mar 1 13:02:52 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To: <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>
References: <47C59591.6040600@errno.com>
    <3aaaa3a0802290744x25a81d68vf0ff101f6b7a819e@mail.gmail.com>
    <1204302128.2126.150.camel@localhost>
    <3aaaa3a0802290854t639559b6if0adc4009997e9db@mail.gmail.com>
    <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>
Message-ID: <20080306013736.GD1500@haakonia.hitnet.RWTH-Aachen.DE>

On Sat, Mar 01, 2008 at 02:20:58AM +0000, Chris wrote:
> On 01/03/2008, Adrian Chadd wrote:
> > On 01/03/2008, Chris wrote:
> >
> > > You're working round what I just said. A NIC should perform equally well
> > > as it does in other operating systems; just because it's cheaper is not
> > > an excuse for buggy performance. There are also other good network
> > > cards apart from the Intel Pro 1000. I am talking about stability, not
> > > performance. I expect an Intel Pro 1000 to outperform a Realtek; however,
> > > I expect both to be stable in terms of connectivity. I expect a
> > > Realtek in FreeBSD to perform as well as a Realtek in Windows and
> > > Linux. :)
> >
> > Patches please!
> >
> >
> > Adrian
> >
> >
> > --
> > Adrian Chadd - adrian@freebsd.org
> >
>
> Ironically, the latest server I got last night has an Intel Pro 1000, a rarity :)
>
> I am just giving feedback, as when I speak to people in the datacentre
> and hosting business the biggest gripe with FreeBSD is hardware
> compatibility. As I adore FreeBSD I ignore this and work round it, but
> it's definitely reducing take-up.
>
> Of course I know current re issues are getting attention, which I am
> thankful for. I fully understand the time and effort required to write
> drivers, patches etc. and have no criticisms of the people who do
> this; my complaint is more focused on people claiming there are no
> issues, it's just the hardware.
>

Pyun YongHyeon has fixed a lot of driver issues (e.g. re(4), bfr(4), vr(4))
over the last few months. Many are already in CURRENT or RELENG_7 (not
sure how many of them made it into 7.0-RELEASE) or posted as patches
to the current@ mailing list.

If you have problems, please see if they persist with a CURRENT snapshot.
If they do, please post to the current@ mailing list with details.

- Christian

--
Christian Brueffer chris@unixpages.org brueffer@FreeBSD.org
GPG Key: http://people.freebsd.org/~brueffer/brueffer.key.asc
GPG Fingerprint: A5C8 2099 19FF AACA F41B B29B 6C76 178C A0ED 982D

From if at xip.at Sat Mar 1 15:08:06 2008
From: if at xip.at (Ingo Flaschberger)
Date: Sat Mar 1 15:20:30 2008
Subject: FBSD 1GBit router?
In-Reply-To: <47C93E8B.3010609@digiware.nl>
References: <20080226003107.54CD94500E@ptavv.es.net>
    <47C8964C.9080309@digiware.nl>
    <47C93E8B.3010609@digiware.nl>
Message-ID:

>> That's why my next router will be based on this box:
>> http://www.axiomtek.com/products/ViewProduct.asp?view=429
>
> Nice piece of hardware.
> I don't like the single 2.5" disk option, though.
>
> And I'm not sure what to think of:
>     "Seven 10/100/1000Mbps (through PCI-E by one
>      interface) ports (RJ-45)"
> Which seems to suggest everything comes in through one PCI-E interface.
> That had better have 8 or 16 lanes.

Each 1000Mbps port is connected via one PCI-E lane, which is fast enough.
1 lane: 250Mbyte/sec -> 2Gbps

>> Hopefully there will be NICs connected directly to the memory bus in the
>> future (HyperTransport-connected NICs).
>
> Well, that is going to be an AMD-only solution, and I'm not even sure
> that AMD would like to have things other than CPUs on that bus.
>
>>
>>> What it does not explain is why you can only get 80Mb/s with 64-byte
>>> packets, which would suggest other bottlenecks than just the bus.
>>
>> Perhaps something with interrupts:
>> http://books.google.at/books?id=pr4fspaQqZkC&pg=PA144&lpg=PA144&dq=pci+interrupt+delay&source=web&ots=zbvVU2CgVx&sig=APe9YjdtK35ccnow7BDI2hzie7s&hl=de#PPA144,M1
>>
>> MSI (Message-Signalled Interrupts) is not very common on the PCI
>> architecture; PCI-E uses only MSI.
>>
>> The kpps always stayed around 100, regardless of whether I used
>> fast-forwarding, fast-interrupts, or HZ values higher than 1000.
>
> MSI is not used for regular PCI buses. It could be that PCI-E does use it;
> I believe you on that. But even then I'd like to know where the bottleneck
> is in the 100kpps limit with 64-byte packets.

As I also tested with polling (currently I use interface polling for the
router) and also reached only 100kpps, the bottleneck must be something
different.

>> But 100kpps is great for router hardware costing about 600eur.
>
> I've seen routers 10 times as expensive that were not able to do that.

me too.

Kind regards,
Ingo Flaschberger

From arkadi at mebius.lv Sat Mar 1 22:15:15 2008
From: arkadi at mebius.lv (Arkadi Shishlov)
Date: Sat Mar 1 22:15:31 2008
Subject: PHP with open_basedir performance problem
In-Reply-To: <20080227082605.GL51827@dracon.ht-systems.ru>
References: <479B1185.8020604@quip.cz>
    <479D89C9.7060300@chistydom.ru>
    <479DD94C.7010409@mawer.org>
    <479DE578.7060202@quip.cz>
    <20080214163037.GA51014@dracon.ht-systems.ru>
    <47B478E6.8080902@mebius.lv>
    <20080227082605.GL51827@dracon.ht-systems.ru>
Message-ID: <47C9CC12.1090509@mebius.lv>

Stanislav Sedov wrote:
> On Thu, Feb 14, 2008 at 07:22:46PM +0200 Arkadi Shishlov mentioned:
>> Stanislav Sedov wrote:
>>> Most basedir problems are linked with the fact that it produces a lot of
>>> lstats/readlinks on every require, include or open command.
>>> On Linux it performs
>>> even worse, as they implemented readlink there by hand, and, of course,
>>> their implementation isn't particularly good.
>> But there is no high sys CPU usage on Linux, contrary to FreeBSD, as
>> reported by the original author of the thread..?
>> Do you have numbers or a benchmark ready? I see the number of syscalls
>> required is astonishing (on Linux) but it doesn't cause any problem at
>> first look.
>
> I don't have specific benchmark numbers, and it's true that top on Linux
> doesn't show such sys time usage as on FreeBSD boxes. However, the overall
> performance of boxes on FreeBSD is 30-40% higher than Linux ones. These
> numbers are empirical, but I'm pretty sure of them: in the past I migrated
> Linux hosting to FreeBSD-based hosting, and after that I was able to add a
> bunch of new users to those boxes without performance impact. In fact, the
> load average on these boxes is MUCH lower than it was on Linux. Also, I
> noticed that stat() costs much more on Linux than on FreeBSD. I don't know
> for certain why Linux shows low sys time usage; probably it's just bugs in
> accounting.

I can confirm that FreeBSD was significantly faster than Linux in the
open_basedir test I just conducted. With the open_basedir check enabled,
FreeBSD throughput dropped 2x, Linux 3x, and FreeBSD is 2x faster than Linux
in this situation.

The test system is a Pentium4 3.8GHz HT, 2MB cache, 2.5GB RAM.
FreeBSD 7.0-RELEASE i386. Linux kernel 2.6.24.2 i386. Both kernels are SMP.
Software is lighttpd 1.4.18, PHP 5.2.5 in FastCGI mode, without an op-code
cache.

The index.php that was tested by ApacheBench is a do-nothing script that
just includes other scripts (in a sub-dir), which in turn include other
scripts, bringing the total count of includes to 25, like in a typical PHP
application. Website document root depth is 4 (/usr/local/www/data).
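
To get a feel for why the open_basedir check is so syscall-heavy in a setup like this, here is a rough Python estimate. It assumes, in line with the lstat/readlink remark quoted above, that resolving each included path costs about one lstat() per path component; that is a simplified cost model, not a trace of PHP itself. The `inc/fileN.php` names are hypothetical; only the document root and the count of 25 includes come from the test description.

```python
# Rough estimate of lstat() calls per request under an assumed cost model:
# one lstat() per path component for every included file.
from pathlib import PurePosixPath

DOC_ROOT = "/usr/local/www/data"   # document root from the test (depth 4)
INCLUDES_PER_REQUEST = 25          # total includes per request in the test


def lstat_calls_for(path):
    """Assumed number of lstat() calls needed to resolve one absolute path."""
    parts = PurePosixPath(path).parts   # ('/', 'usr', 'local', ...)
    return len(parts) - 1               # the root '/' itself is not counted


# Hypothetical layout: included scripts sit one sub-directory below the root.
include_paths = [f"{DOC_ROOT}/inc/file{i}.php"
                 for i in range(INCLUDES_PER_REQUEST)]

calls_per_request = sum(lstat_calls_for(p) for p in include_paths)
print(calls_per_request)
```

With six components per path this comes to 150 calls per request under the assumed model, so at around a hundred requests per second the box would be issuing on the order of fifteen thousand lstat() calls per second for path checks alone.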
I've varied test parameters and filesystem setup (tmpfs, mdmfs), but overall
the picture is:

  http       |  no open_basedir    | with open_basedir
response size>   25kB   |   50B    |       50B
-------------+----------+----------+------------------
FreeBSD      | 192/125  | 243/ 89  |  99/247
Linux        | 165/116  | 152/126  |  50/382

[Requests per second / 99% of requests served within N ms]

ApacheBench concurrency level is 15. 10 FastCGI processes.
top shows approximately a 20% user / 80% system time split for both Linux
and FreeBSD in all tests (so accounting is likely correct on Linux).

From rwatson at FreeBSD.org Sun Mar 2 20:33:39 2008
From: rwatson at FreeBSD.org (Robert Watson)
Date: Sun Mar 2 20:34:59 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To: <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>
References: <47C59591.6040600@errno.com>
    <3aaaa3a0802290744x25a81d68vf0ff101f6b7a819e@mail.gmail.com>
    <1204302128.2126.150.camel@localhost>
    <3aaaa3a0802290854t639559b6if0adc4009997e9db@mail.gmail.com>
    <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>
Message-ID: <20080302203059.J31090@fledge.watson.org>

On Sat, 1 Mar 2008, Chris wrote:

> Ironically, the latest server I got last night has an Intel Pro 1000, a
> rarity :)
>
> I am just giving feedback, as when I speak to people in the datacentre and
> hosting business the biggest gripe with FreeBSD is hardware compatibility.
> As I adore FreeBSD I ignore this and work round it, but it's definitely
> reducing take-up.
>
> Of course I know current re issues are getting attention, which I am
> thankful for. I fully understand the time and effort required to write
> drivers, patches etc. and have no criticisms of the people who do this; my
> complaint is more focused on people claiming there are no issues, it's just
> the hardware.

It's no coincidence that Intel cards work quite well with FreeBSD, given that
Intel has hired developers to make FreeBSD work well on their cards.
The same goes for companies like Broadcom, Chelsio, Neterion, etc., who
provide not only the necessary documentation, but also put development
resources into writing and QAing drivers. Put pressure on your hardware
providers to do the same thing for their hardware -- one or two people asking
may not do the trick, but a few large customers beating on their sales
engineers can make a big difference, and so can larger numbers of smaller
customers.

Robert N M Watson
Computer Laboratory
University of Cambridge

From algardo at sura.ru Mon Mar 3 05:33:34 2008
From: algardo at sura.ru (Aleksey Perov)
Date: Mon Mar 3 05:33:38 2008
Subject: Upgrade from 6.3 to 7.0: Result ->Shared object not found :)
In-Reply-To: <47C826E0.7070903@moneybookers.com>
References: <47C59591.6040600@errno.com>
    <47C6991E.1050502@FreeBSD.org>
    <01ce01c87adc$e2796720$a76c3560$@lv>
    <47C817B0.1040602@FreeBSD.org>
    <01d201c87ae4$4b902850$e2b078f0$@lv>
    <47C82092.5040504@FreeBSD.org>
    <01d301c87ae7$7e2b2550$7a816ff0$@lv>
    <47C826E0.7070903@moneybookers.com>
Message-ID: <20080303083331.f9eab113.algardo@sura.ru>

Stefan Lambrev wrote:

> Also you can use script(1) to log, so you can analyze the output later.
> I'm not sure how script will run if it is started in screen
> (ports/sysutils/screen)

Works just fine for me:

screen
script portupgrade.log
portupgrade -a

--
Aleksey

From freebsd at sopwith.solgatos.com Mon Mar 3 06:06:34 2008
From: freebsd at sopwith.solgatos.com (Dieter)
Date: Mon Mar 3 06:06:39 2008
Subject: FBSD 1GBit router?
In-Reply-To: Your message of "Sat, 01 Mar 2008 12:31:23 +0100." <47C93E8B.3010609@digiware.nl>
Message-ID: <200803030601.GAA01816@sopwith.solgatos.com>

> > Hopefully there will be NICs connected directly to the memory bus in the
> > future (HyperTransport-connected NICs).
>
> Well, that is going to be an AMD-only solution, and I'm not even sure
> that AMD would like to have things other than CPUs on that bus.

There are FPGAs that plug into a CPU socket (for mainboards with multiple
CPU sockets). There will be GPUs on the HyperTransport bus. Putting a
network controller there seems a tad extreme.

From tk at webmatic.de Mon Mar 3 09:18:25 2008
From: tk at webmatic.de (Thomas Krause (Webmatic))
Date: Mon Mar 3 09:18:30 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To: <20080306013736.GD1500@haakonia.hitnet.RWTH-Aachen.DE>
References: <47C59591.6040600@errno.com>
    <3aaaa3a0802290744x25a81d68vf0ff101f6b7a819e@mail.gmail.com>
    <1204302128.2126.150.camel@localhost>
    <3aaaa3a0802290854t639559b6if0adc4009997e9db@mail.gmail.com>
    <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>
    <20080306013736.GD1500@haakonia.hitnet.RWTH-Aachen.DE>
Message-ID: <47CBBC46.8080202@webmatic.de>

>>
>
> Pyun YongHyeon has fixed a lot of driver issues (e.g. re(4), bfr(4), vr(4))
> over the last few months. Many are already in CURRENT or RELENG_7 (not
> sure how many of them made it into 7.0-RELEASE) or posted as patches
> to the current@ mailing list.

My question is: which PCI-Express GBit NICs other than Intel's are
available on the market? I can't find others ...

Best regards,
Thomas.

From freebsd at sopwith.solgatos.com Mon Mar 3 16:44:42 2008
From: freebsd at sopwith.solgatos.com (Dieter)
Date: Mon Mar 3 16:44:45 2008
Subject: PCIe vs PCI (was: Re: FreeBSD bind performance in FreeBSD 7)
In-Reply-To: Your message of "Mon, 03 Mar 2008 09:52:22 +0100." <47CBBC46.8080202@webmatic.de>
Message-ID: <200803031642.QAA05460@sopwith.solgatos.com>

> My question is: which PCI-Express GBit NICs other than Intel's are
> available on the market? I can't find others ...

"man -k pcie" on 6.2 gives:

bce(4) - Broadcom NetXtreme II (BCM5706/BCM5708) PCI/PCIe Gigabit Ethernet adapter driver
re(4) - RealTek 8139C+/8169/816xS/811xS/8101E PCI/PCIe Ethernet adapter driver

There might be more in 7.0 but it is still downloading. :-(
And there may be PCIe devices that don't show up in man -k.

Is there a way to tell from dmesg or pciconf that something is
PCIe rather than PCI? The onboard stuff could be either.

From zbeeble at gmail.com Mon Mar 3 17:22:15 2008
From: zbeeble at gmail.com (Zaphod Beeblebrox)
Date: Mon Mar 3 17:22:18 2008
Subject: PCIe vs PCI (was: Re: FreeBSD bind performance in FreeBSD 7)
In-Reply-To: <200803031642.QAA05460@sopwith.solgatos.com>
References: <47CBBC46.8080202@webmatic.de>
    <200803031642.QAA05460@sopwith.solgatos.com>
Message-ID: <5f67a8c40803030922w2ec6455ekf9de6da6fb2e571d@mail.gmail.com>

On Mon, Mar 3, 2008 at 8:42 AM, Dieter wrote:

> > My question is: which PCI-Express GBit NICs other than Intel's are
> > available on the market? I can't find others ...
>
> "man -k pcie" on 6.2 gives:

AFAIK, PCIe devices show up as PCI devices to most drivers. I know that
'em' flavor drivers work fine with PCIe and that the 'ath' driver also does
(in this case, the ath is an 'express card' ... which I'm told is another
flavor of PCIe).

I think the only oddball thing about PCIe is that some flavours of PCIe can
show up as a USB device rather than a PCI device.

From brueffer at FreeBSD.org Mon Mar 3 16:59:31 2008
From: brueffer at FreeBSD.org (Christian Brueffer)
Date: Mon Mar 3 17:46:31 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To: <47CBBC46.8080202@webmatic.de>
References: <47C59591.6040600@errno.com>
    <3aaaa3a0802290744x25a81d68vf0ff101f6b7a819e@mail.gmail.com>
    <1204302128.2126.150.camel@localhost>
    <3aaaa3a0802290854t639559b6if0adc4009997e9db@mail.gmail.com>
    <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>
    <20080306013736.GD1500@haakonia.hitnet.RWTH-Aachen.DE>
    <47CBBC46.8080202@webmatic.de>
Message-ID: <20080303165928.GB1479@haakonia.hitnet.RWTH-Aachen.DE>

On Mon, Mar 03, 2008 at 09:52:22AM +0100, Thomas Krause (Webmatic) wrote:
> >>
> >
> >Pyun YongHyeon has fixed a lot of driver issues (i.e.
> >re(4), bfr(4), vr(4))
> >over the last few months. Many are already in CURRENT or RELENG_7 (not
> >sure how many of them made it into 7.0-RELEASE) or posted as patches
> >to the current@ mailing list.
>
> My question is: which PCI-Express GBit NICs other than Intel's are
> available on the market? I can't find others ...
>

Googling for "pci express gigabit ethernet" gives quite a few hits.

- Christian

--
Christian Brueffer chris@unixpages.org brueffer@FreeBSD.org
GPG Key: http://people.freebsd.org/~brueffer/brueffer.key.asc
GPG Fingerprint: A5C8 2099 19FF AACA F41B B29B 6C76 178C A0ED 982D

From mike at sentex.net Mon Mar 3 19:57:57 2008
From: mike at sentex.net (Mike Tancsa)
Date: Mon Mar 3 19:58:02 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To: <47CBBC46.8080202@webmatic.de>
References: <47C59591.6040600@errno.com>
    <3aaaa3a0802290744x25a81d68vf0ff101f6b7a819e@mail.gmail.com>
    <1204302128.2126.150.camel@localhost>
    <3aaaa3a0802290854t639559b6if0adc4009997e9db@mail.gmail.com>
    <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>
    <20080306013736.GD1500@haakonia.hitnet.RWTH-Aachen.DE>
    <47CBBC46.8080202@webmatic.de>
Message-ID: <200803031957.m23JvthF090729@lava.sentex.ca>

At 03:52 AM 3/3/2008, Thomas Krause (Webmatic) wrote:
>>Pyun YongHyeon has fixed a lot of driver issues (e.g. re(4), bfr(4), vr(4))
>>over the last few months. Many are already in CURRENT or RELENG_7 (not
>>sure how many of them made it into 7.0-RELEASE) or posted as patches
>>to the current@ mailing list.
>
>My question is: which PCI-Express GBit NICs other than Intel's are
>available on the market? I can't find others ...

I have used a few bge NICs that are PCIe....
However, I suggest sticking with the Intel for now. My home box has a PCIe
bge NIC. It works fine for my home server on RELENG_7 (Samba, NFS).

bge0@pci0:2:0:0: class=0x020000 card=0x167714e4 chip=0x167714e4 rev=0x01 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'BCM5750A1 NetXtreme Gigabit Ethernet PCI Express'
    class      = network
    subclass   = ethernet
    cap 01[48] = powerspec 2 supports D0 D3 current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 8 messages, 64 bit
    cap 10[d0] = PCI-Express 1 endpoint

bge0: mem 0xfddf0000-0xfddfffff irq 18 at device 0.0 on pci2
miibus0: on bge0
brgphy0: PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge0: Ethernet address: 00:10:18:14:15:43
bge0: [ITHREAD]

    ---Mike

From freebsd at sopwith.solgatos.com Mon Mar 3 22:38:59 2008
From: freebsd at sopwith.solgatos.com (Dieter)
Date: Mon Mar 3 22:39:03 2008
Subject: PCIe vs PCI (was: Re: FreeBSD bind performance in FreeBSD 7)
In-Reply-To: Your message of "Mon, 03 Mar 2008 08:42:03 GMT."
Message-ID: <200803032134.VAA08472@sopwith.solgatos.com>

> Is there a way to tell from dmesg or pciconf that something is
> PCIe rather than PCI? The onboard stuff could be either.

The secret appears to be

pciconf -l -v | grep -i express
    device = 'BCM5750A1 NetXtreme Gigabit Ethernet PCI Express'

This bge(4) Broadcom chip had problems in 6.0 but works well in 6.2.

The critical part for my application is not getting the maximum
number of packets per second, but not dropping any, and getting
them acked rapidly. The closed-source "black box" on the other end
of the wire has a buggy network stack, a *way* too small transmit buffer,
and generates data in real time that I have only one chance to capture.

The remaining problem is that other device drivers can lock it out.
For example: http://www.freebsd.org/cgi/query-pr.cgi?pr=118093

From tedm at toybox.placo.com Tue Mar 4 04:45:44 2008
From: tedm at toybox.placo.com (Ted Mittelstaedt)
Date: Tue Mar 4 04:45:56 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To: <3aaaa3a0802291820j58a24de7wb39ebf2a2653f579@mail.gmail.com>
Message-ID:

> -----Original Message-----
> From: owner-freebsd-questions@freebsd.org
> [mailto:owner-freebsd-questions@freebsd.org]On Behalf Of Chris
> Sent: Friday, February 29, 2008 6:21 PM
> To: Adrian Chadd
> Cc: freebsd-performance@freebsd.org; freebsd-questions@freebsd.org
> Subject: Re: FreeBSD bind performance in FreeBSD 7
>
>
> On 01/03/2008, Adrian Chadd wrote:
> > On 01/03/2008, Chris wrote:
> >
> > > You're working round what I just said. A NIC should perform equally well
> > > as it does in other operating systems; just because it's cheaper is not
> > > an excuse for buggy performance. There are also other good network
> > > cards apart from the Intel Pro 1000. I am talking about stability, not
> > > performance. I expect an Intel Pro 1000 to outperform a Realtek; however,
> > > I expect both to be stable in terms of connectivity. I expect a
> > > Realtek in FreeBSD to perform as well as a Realtek in Windows and
> > > Linux. :)
> >
> > Patches please!
> >
> >
> > Adrian
> >
> >
> > --
> > Adrian Chadd - adrian@freebsd.org
> >
>
> Ironically, the latest server I got last night has an Intel Pro 1000, a
> rarity :)
>
> I am just giving feedback, as when I speak to people in the datacentre
> and hosting business the biggest gripe with FreeBSD is hardware
> compatibility. As I adore FreeBSD I ignore this and work round it, but
> it's definitely reducing take-up.
>
> Of course I know current re issues are getting attention, which I am
> thankful for. I fully understand the time and effort required to write
> drivers, patches etc.
> and have no criticisms of the people who do
> this; my complaint is more focused on people claiming there are no
> issues, it's just the hardware.
>

There aren't issues on hardware that is compatible.

You can't run MacOS X on an off-the-shelf PC and nobody complains
about it. You can't run Solaris for the Sparc on an Intel box
but nobody complains about it.

FreeBSD is not Java; it is not "write once, run anywhere".

If there is any problem with FreeBSD in this respect, it is that it
supports the poor hardware AT ALL. Of course, we can't
do much about that - a code contributor who gets access to CVS
can put anything they want into the FreeBSD source, and drivers
are a particular problem - since few developers are going to
have duplicates of the hardware, only the contributing developer
really knows if his driver is solid or not.

Arguably it might be better to drop support for poor hardware;
then the people who had such hardware would not be tempted
to run FreeBSD - thereby having a bad experience with it, and
blaming FreeBSD for it.

I challenge you to find an example of very high quality hardware
that has a driver in FreeBSD that has a lot of problems. Yet,
you can find a lot of poor quality hardware that has a FreeBSD
driver with a lot of problems. That should tell you something -
that the issue for the poor hardware really is "just the hardware".

The people complaining about hardware compatibility need to
pull their heads out. If they are buying brand new systems
they are utter fools if they don't check out in advance what
works and what doesn't. It's not like there's a shortage of
experienced people on this list who could tell them what to buy.
And if after the fact they find out their shiny new PC won't
run FreeBSD - then they take it back to the retailer and exchange
it for a different model. Why is this so difficult?

My beef with the DNS tests was that ISC ran out and bought
the hardware FIRST, -then- they started testing.
This is
directly contrary to every bit of advice ever given in
the computer industry for the last 50 years - you select
the software FIRST, -then- you buy the hardware that runs it.
In short, it said far more about the incompetence of the
testers than the shortcomings of the software.

The people who have USED systems and are bitching about FreeBSD not
being compatible with their stuff need to get over it. OK, so they
didn't get a chance to select the hardware; they are using some
retired Windows box that won't run the new version of Windows.
So they come here and our stuff has a problem with some hardware
part. Well, OK fine - how does this hurt them? Their old computer
wasn't usable for Windows anymore, now was it? In short, their
computer at that point was worthless - and why is it OUR responsibility
to make our stuff compatible with their old computer? How does
us being incompatible take anything away from them - their computer
was scrap anyway. If there's a problem, well, they can go to the
computer junkyard and exchange their scrap computer for a different
old scrap computer that has compatible parts.

Ted

From Peter_Losher at isc.org Tue Mar 4 06:17:52 2008
From: Peter_Losher at isc.org (Peter Losher)
Date: Tue Mar 4 06:17:56 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To:
References:
Message-ID: <47CCE982.4060201@isc.org>

Ted Mittelstaedt wrote:

> My beef with the DNS tests was that ISC ran out and bought
> the hardware FIRST, -then- they started testing. This is
> directly contrary to every bit of advice ever given in
> the computer industry for the last 50 years - you select
> the software FIRST, -then- you buy the hardware that runs it.
> In short, it said far more about the incompetence of the
> testers than the shortcomings of the software.

This is ridiculous.
ISC is one of the most fervent pro-FreeBSD companies out there (basing most of our services on the OS, and contributing to the FreeBSD community including the busiest CVSup & FTP servers and have FreeBSD committers on staff) I will not stand back and watch folks on a public mailing list call us incompetent individuals with a anti-FreeBSD bias. First off the final report was published last Friday at: http://www.isc.org/pubs/tn/index.pl?tn=isc-tn-2008-1.html (the server this is served from runs FreeBSD) I was not one of the direct testers (we had a couple PhD's handling that, who I know both use FreeBSD on their personal systems), but as one of the folks who supported them in their work, I can tell you that the stats we gave the FreeBSD folks were from a test sponsored by the US National Science Foundation. We were mandated to use branded HW and we tested several models from HP, Sun, even Iron Systems (whitebox) before deciding on the HP's. The mechanism we used are all documented in the paper We were also asked to test DNS performance on several OS's. The short version was 'take a standard commercial off the shelf' server and see how BIND performs (esp. with DNSSEC) on it. We weren't asked to get hardware that was perfect for Brand X OS; that wasn't part of the remit. (We actually use the exact same HP HW for a secondary service where we host a couple of thousand zones using BIND including 30+ TLD zones. Oh and it runs FreeBSD) Yes we found FreeBSD performed poorly in our initial tests. and I talked to several folks (including rwatson and kris) about the issue. Kris had already been working on improving performance with MySQL and PgSQL and was interested in doing the same with BIND. Kris went off and hacked away and right before EuroBSDcon last September asked us to re-run the tests (on the same HW) using a 7.0-CURRENT snapshot, and the end results are shown with a 33,000 query increase over 6.2-RELEASE, bring FreeBSD just behind the Linux distros we tested. 
I know rwatson and kris have continually worked on the relevant network stack issues that cover BIND, and additional performance gains have been found since then, and working on this issue has been a true partnership between the FreeBSD developers and ISC.

BIND isn't perfect, we admit that; we have been constantly improving its multi-CPU performance, and BIND 9.4 and 9.5 are continuing in that effort. We have several members of our dev team who use FreeBSD as their development platform, including a FreeBSD committer.

So Ted, stop spouting this "ISC is spewing anti-FreeBSD bias" crap, it flatly isn't true...

Oh, and this email is coming to you via several of ISC's FreeBSD MX servers, which resolve the freebsd.org name via caching DNS servers running FreeBSD, to freebsd.org's MX server over an IPv6 tunnel supplied by ISC to the FreeBSD project to help FreeBSD eat their own IPv6 dog food...

Yeah, ISC just hates FreeBSD...

Best Wishes - Peter
--
Peter_Losher@isc.org | ISC | OpenPGP 0xE8048D08 | "The bits must flow"

From tk at webmatic.de Tue Mar 4 06:53:58 2008
From: tk at webmatic.de (Thomas Krause (Webmatic))
Date: Tue Mar 4 06:54:04 2008
Subject: PCIe vs PCI
In-Reply-To: <200803031642.QAA05460@sopwith.solgatos.com>
References: <200803031642.QAA05460@sopwith.solgatos.com>
Message-ID: <47CCF188.8060101@webmatic.de>

Dieter wrote:
>> My question is: which other PCI-Express GBit NICs than Intel's are
>> available on the market? I can't find others ...
> > "man -k pcie" on 6.2 gives: > > bce(4) - Broadcom NetXtreme II (BCM5706/BCM5708) PCI/PCIe Gigabit Ethernet adapter driver > re(4) - RealTek 8139C+/8169/816xS/811xS/8101E PCI/PCIe Ethernet adapter driver > > There might be more in 7.0 but it is still downloading. :-( > And there may be PCIe devices that don't show up in man -k. > Is there a way to tell from dmesg or pciconf that something is > PCIe rather than PCI? The onboard stuff could be either. I don't speak from onboard cards! I speak from PCI-E cards you can plug into the PCI-E slot. E.g. in Ingram Micros online shop there are only Intel PCI-E-Cards listed. I know there are good PCI-E NIC's from Sysconnect, but I can only buy them from special dealers. Compared to PCI-NIC's - I realy don't have a big choice of PCI-E NIC's. Regards, Thomas. From chrcoluk at gmail.com Tue Mar 4 12:19:11 2008 From: chrcoluk at gmail.com (Chris) Date: Tue Mar 4 12:19:22 2008 Subject: FreeBSD bind performance in FreeBSD 7 In-Reply-To: References: <3aaaa3a0802290744x25a81d68vf0ff101f6b7a819e@mail.gmail.com> Message-ID: <3aaaa3a0803040419w25c5ac6mc4b8b6faf4f6281a@mail.gmail.com> On 29/02/2008, Ted Mittelstaedt wrote: > > > > Device drivers and hardware are a cooperative effort. The ideal > is a well-written device driver and well-designed hardware. > Unfortunately the reality of it appears to be that it costs > a LOT more money to hire good silicon designers than it costs > to hire good programmers - so a depressing amount of computer > hardware out there is very poor hardware, but the hardware's > shortcomings are made up by almost Herculean efforts of the > software developers. > > I should have thought the invention of the Winmodem (windows-only > modem) would have made this obvious to the general public > years ago. 
> > Unfortunately, the hardware vendors make a lot of effort to > conceal the crappiness of their designs and most customers > just care if the device works, they don't care if the only > way the device can work is if 60% of their system's CPU is > tied up servicing a device driver that is making up for > hardware shortcomings, so it is still rather difficult > for a customer to become informed about what is good and > what isn't - other than trial and error. > > I hardly think that the example I cited - the 3com 3c905 PCI > network adapter - is an example of poor support in FreeBSD. > The FreeBSD driver for the 509 worked perfectly well when > the 309 used a Lucent-built ASIC. When 3com decided to > save 50 cents a card by switching to Broadcom for the > ASIC manufacturing, the FreeBSD driver didn't work very > well with those cards - nor did the Linux driver for that > matter. This clearly wasn't a driver problem it was a > problem with Broadcom not following 3com's design specs > properly. 3com did the only thing they could - which > was to put a hack into the Windows driver - but of course, > nobody bothered telling the Linux or FreeBSD community > about it, we had to find out by dicking around with the > driver code. > > If datacenters want to purchase poor hardware and run their > stuff on it, that's their choice. Just because a piece > of hardware is "mainstream" doesen't mean it's good. It > mainly means it's inexpensive. > > Ted > Ted I never meant mainstream = good but I did mean mainstream cannot be ignored and written off if something is mainstream it is for a reason if the hardware was so poor then I am sure complaints would be so high it would no longer be mainstream. Not sure if you understanding me I am most defenitly not saying I expect a cheap network card to perform on par with a premium card. 
I am merely saying that, ideally, it should perform and be as stable as it is in other operating systems, and if it isn't, then look at what can be improved rather than just saying go buy a new piece of kit. Is FreeBSD an operating system for use on premium hardware only? That is what it feels like I am reading sometimes.

Now, on the BIND tests: if the hardware used on both Linux and FreeBSD was the exact same spec, then blaming the hardware is invalid, as it's apples vs. apples. Obviously, if the Linux tests were done on superior hardware, then it's apples vs. oranges and the tests are invalidated.

Chris

From alan.bryan at yahoo.com Tue Mar 4 18:17:04 2008
From: alan.bryan at yahoo.com (alan bryan)
Date: Tue Mar 4 18:17:08 2008
Subject: 7.0-Release and 3ware 9550SXU w/BBU - horrible write performance
Message-ID: <497790.39526.qm@web50507.mail.re2.yahoo.com>

Hi,

I've got a new server with a 3ware 9550SXU with the Battery. I am using FreeBSD 7.0-Release (tried both 4BSD and ULE) using AMD64 and the 3ware performance for writes is just plain horrible. Something is obviously wrong but I'm not sure what.

I've got a 4 disk RAID 10 array.

According to 3dm2 the cache is on. I even tried setting the StorSave preference to "Performance" with no real benefit. There seems to be something really wrong with disk performance. Here are the results from bonnie:

File './Bonnie.2551', size: 104857600
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          100  9989  4.8  6739  1.0 18900  7.8 225973 98.5 1914662 99.9 177210.7 259.7

Any ideas? Anybody have one of these that's working with FreeBSD 7?
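To put the write figures from the bonnie run above into more familiar units, here is a minimal sketch that converts the reported K/sec values to MB/s. It assumes bonnie's "K" means 1024 bytes; the dictionary labels and the function name are illustrative only and are not bonnie field names.

```python
# Write-side rates copied from the bonnie output above, in K/sec.
# The keys are illustrative labels, not names used by bonnie itself.
BONNIE_WRITE_K_PER_SEC = {
    "per-char write": 9989,
    "block write": 6739,
    "rewrite": 18900,
}


def k_per_sec_to_mb_per_sec(k_per_sec):
    """Convert K/sec to MB/s, assuming 1 K = 1024 bytes and 1 MB = 1024 K."""
    return k_per_sec / 1024.0


for label, rate in BONNIE_WRITE_K_PER_SEC.items():
    print(f"{label}: {k_per_sec_to_mb_per_sec(rate):.1f} MB/s")
```

Under that assumption the block-write rate works out to well under 10 MB/s, which is the figure the rest of the thread is trying to explain.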
From lamont at cluepon.com Tue Mar 4 19:15:03 2008
From: lamont at cluepon.com (Lamont Lucas)
Date: Tue Mar 4 19:15:08 2008
Subject: 7.0-Release and 3ware 9550SXU w/BBU - horrible write performance
In-Reply-To: <497790.39526.qm@web50507.mail.re2.yahoo.com>
References: <497790.39526.qm@web50507.mail.re2.yahoo.com>
Message-ID: <47CD9954.5040105@cluepon.com>

alan bryan wrote:
> Hi,
>
> I've got a new server with a 3ware 9550SXU with the
> Battery. I am using FreeBSD 7.0-Release (tried both
> 4BSD and ULE) using AMD64 and the 3ware performance
> for writes is just plain horrible. Something is
> obviously wrong but I'm not sure what.

Hello Alan. Have you ever used this card with any other OS? I have about 6 of them running under Linux and I too get horrible write performance from them. After several days of tuning and research, I've concluded that the cards are just stinkers regardless of what OS they are running. That seems to be the consensus if you google 3ware write performance. I've put them in 64 and 32 bit slots, at various speeds and with various drives.

I had an older 3ware card running under FreeBSD 6.2 that would give decent read performance in a raid0 configuration but terrible write performance. I performed no tuning on that setup, as my application was mostly read and it was "good enough" for the time.

Not terribly helpful, I admit, but I wanted you to at least think about it being the card rather than the OS.
From mike at sentex.net Tue Mar 4 19:24:29 2008
From: mike at sentex.net (Mike Tancsa)
Date: Tue Mar 4 19:24:33 2008
Subject: 7.0-Release and 3ware 9550SXU w/BBU - horrible write performance
In-Reply-To: <497790.39526.qm@web50507.mail.re2.yahoo.com>
References: <497790.39526.qm@web50507.mail.re2.yahoo.com>
Message-ID: <200803041924.m24JOOt0096588@lava.sentex.ca>

At 12:50 PM 3/4/2008, alan bryan wrote:
>Hi,
>
>I've got a new server with a 3ware 9550SXU with the
>Battery. I am using FreeBSD 7.0-Release (tried both
>4BSD and ULE) using AMD64 and the 3ware performance
>for writes is just plain horrible. Something is
>obviously wrong but I'm not sure what.

Not sure about 7.0, but I have this card on a 6.3 box. Doing something simple like

% cat /dev/zero > big
% iostat -c 100
      tty             da0            pass0             cpu
 tin tout   KB/t  tps   MB/s   KB/t tps  MB/s  us ni sy in  id
   0    0  53.92    6   0.31   0.00   0  0.00   3  0  0  0  96
   3  197   4.00    1   0.00   0.00   0  0.00   0  0  0  0 100
   1   64 127.94 1100 137.42   0.00   0  0.00   0  0 49  5  46
   0   63 128.00 1012 126.50   0.00   0  0.00   2  0 34  5  59
   0   63 128.00  969 121.13   0.00   0  0.00   1  0 30  2  66
   0   62 127.85  758  94.67   0.00   0  0.00   1  0 26  4  69
   0   61 127.82 1252 156.25   0.00   0  0.00   1  0 50  8  42
   0   63 127.59  542  67.59   0.00   0  0.00   1  0 26  1  72
   0   61 127.66 1026 127.90   0.00   0  0.00   0  0 47  5  49
   1   87 127.56  513  63.97   0.00   0  0.00   0  0 19  4  77
   0   61   0.00    0   0.00   0.00   0  0.00   0  0  0  0 100

Shows pretty OK write performance.

% dd if=/dev/zero of=/var/tmp/big bs=1024k count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 8.646134 secs (121276863 bytes/sec)

This is on RAID10 on 4 Seagate ST380811AS drives.
3ware device driver for 9000 series storage controllers, version: 3.60.04.003
twa0: <3ware 9000 series Storage Controller> port 0xac00-0xac3f mem 0xf4000000-0xf5ffffff,0xff2ff000-0xff2fffff irq 16 at device 3.0 on pci2
twa0: [GIANT-LOCKED]
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-4LP, 4 ports, Firmware FE9X 3.01.01.028, BIOS BE9X 3.01.00.024
da0: Fixed Direct Access SCSI-3 device
da0: 100.000MB/s transfers
da0: 152566MB (312455168 512 byte sectors: 255H 63S/T 19449C)

I have write cache enabled and performance set for the StorSave. It's in a PCI-X slot as well.

---Mike

From bsam at ipt.ru Tue Mar 4 21:03:30 2008
From: bsam at ipt.ru (Boris Samorodov)
Date: Tue Mar 4 21:03:40 2008
Subject: 7.0-Release and 3ware 9550SXU w/BBU - horrible write performance
In-Reply-To: <497790.39526.qm@web50507.mail.re2.yahoo.com> (alan bryan's message of "Tue\, 4 Mar 2008 09\:50\:23 -0800 \(PST\)")
References: <497790.39526.qm@web50507.mail.re2.yahoo.com>
Message-ID: <17941875@bb.ipt.ru>

On Tue, 4 Mar 2008 09:50:23 -0800 (PST) alan bryan wrote:

> I've got a new server with a 3ware 9550SXU with the
> Battery. I am using FreeBSD 7.0-Release (tried both
> 4BSD and ULE) using AMD64 and the 3ware performance
> for writes is just plain horrible. Something is
> obviously wrong but I'm not sure what.
> I've got a 4 disk RAID 10 array.
> According to 3dm2 the cache is on. I even tried
> setting The StorSave preference to "Performance" with
> no real benefit. There seems to be something really
> wrong with disk performance. Here's the results from
> bonnie:
> File './Bonnie.2551', size: 104857600
> Writing with putc()...done
> Rewriting...done
> Writing intelligently...done
> Reading with getc()...done
> Reading intelligently...done
> Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
>               -------Sequential Output-------- ---Sequential Input-- --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
>           100  9989  4.8  6739  1.0 18900  7.8 225973 98.5 1914662 99.9 177210.7 259.7
> Any ideas? Anybody have one of these that's working
> with FreeBSD 7?

I had almost the same problem. The transfer rate was less than 10Mb/s for raid-10. When a BBU arrived I thought my problems were gone. Not at once, though: at first nothing changed. I tested many combinations to no avail (it took some hours). Then it so happened that I set StorSave to "Balanced", cache "on", and (by accident) marked the raid for checking (or verifying... can't recall what it's called in the 3ware BIOS). The test began, and I stopped it. I rebooted the OS. To my surprise, after booting the transfer rate was quite good. BTW, the test resumed after booting...

Maybe the card did not use the cache while the battery was uncharged? I don't know (but I noticed "twa0: INFO: (0x04: 0x0056): Battery charging completed:" in dmesg).
Here it is now with WD RE2 1T disks:
-----
3ware device driver for 9000 series storage controllers, version: 3.70.05.001
twa0: <3ware 9000 series Storage Controller> port 0xc800-0xc8ff mem 0xf8000000-0xf9ffffff,0xfeaff000-0xfeafffff irq 16 at device 0.0 on pci4
twa0: [ITHREAD]
twa0: INFO: (0x04: 0x0029): Verify started: unit=0
twa0: INFO: (0x04: 0x003D): Verify paused: unit=0
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-8LPML, 8 ports, Firmware FE9X 3.08.02.005, BIOS BE9X 3.08.00.002
-----
da0 at twa0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 100.000MB/s transfers
da0: 953632MB (1953038336 512 byte sectors: 255H 63S/T 121571C)
-----
/dev/da0s1d 902G 36G 794G 4% /space
-----
btest% uname -a
FreeBSD btest.ipt.ru 7.0-RELEASE FreeBSD 7.0-RELEASE #14: Mon Mar 3 18:27:26 MSK 2008 root@btest.ipt.ru:/usr/obj/usr/src/sys/BTEST i386
btest% sudo dd if=/dev/da0 of=/dev/null bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 5.073393 secs (206681412 bytes/sec)
btest% sudo dd if=/dev/zero of=/space/dd.file bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 5.670188 secs (184927909 bytes/sec)
-----

WBR
--
Boris Samorodov (bsam)
Research Engineer, http://www.ipt.ru Telephone & Internet SP
FreeBSD committer, http://www.FreeBSD.org The Power To Serve

From bsam at ipt.ru Tue Mar 4 21:08:29 2008
From: bsam at ipt.ru (Boris Samorodov)
Date: Tue Mar 4 21:08:32 2008
Subject: 7.0-Release and 3ware 9550SXU w/BBU - horrible write performance
In-Reply-To: <17941875@bb.ipt.ru> (Boris Samorodov's message of "Tue\, 04 Mar 2008 23\:25\:00 +0300")
References: <497790.39526.qm@web50507.mail.re2.yahoo.com> <17941875@bb.ipt.ru>
Message-ID: <42900454@bb.ipt.ru>

On Tue, 04 Mar 2008 23:25:00 +0300 Boris Samorodov wrote:

> Here it is now with WD RE2 1T disks:

Sorry, these are just plain WD-JS 0.5T disks here.

WBR
--
Boris Samorodov (bsam)
Research Engineer, http://www.ipt.ru Telephone & Internet SP
FreeBSD committer, http://www.FreeBSD.org The Power To Serve

From alan.bryan at yahoo.com Wed Mar 5 18:09:11 2008
From: alan.bryan at yahoo.com (alan bryan)
Date: Wed Mar 5 18:09:15 2008
Subject: 7.0-Release and 3ware 9550SXU w/BBU - horrible write performance
In-Reply-To: <497790.39526.qm@web50507.mail.re2.yahoo.com>
Message-ID: <96374.85181.qm@web50511.mail.re2.yahoo.com>

--- alan bryan wrote:

> Hi,
>
> I've got a new server with a 3ware 9550SXU with the
> Battery. I am using FreeBSD 7.0-Release (tried both
> 4BSD and ULE) using AMD64 and the 3ware performance
> for writes is just plain horrible. Something is
> obviously wrong but I'm not sure what.
>
> I've got a 4 disk RAID 10 array.
>
> According to 3dm2 the cache is on. I even tried
> setting The StorSave preference to "Performance"
> with
> no real benefit. There seems to be something really
> wrong with disk performance. Here's the results
> from
> bonnie:

OK - so, I ran the server for about 24 hrs while it did its battery test. After that it automatically turned on its write cache. So, even though 3dm2 was initially reporting that the cache was on, I guess it must not have been.
I ran some more tests:

Version 1.93d       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
server          16G   530  99 120160  23 28207   8   903  99 109688  16 518.7  18
Latency             18651us     541ms     447ms   11461us     271ms   66793us
Version 1.93d       ------Sequential Create------ --------Random Create--------
server              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 20737  34 +++++ +++ +++++ +++ 20429  37 +++++ +++ +++++ +++
Latency               146ms      22us     101ms     150ms      24us      31us
1.93c,1.93d,server,1,1204745494,16G,,530,99,120160,23,28207,8,903,99,109688,16,518.7,18,16,,,,,20737,34,+++++,+++,+++++,+++,20429,37,+++++,+++,+++++,+++,18651us,541ms,447ms,11461us,271ms,66793us,146ms,22us,101ms,150ms,24us,31us

And my original test again:

Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          100 154436 71.4 86016 10.8 104301 13.7 226587 100.0 1934776 101.6 173626.2 261.3

>dd if=/dev/zero of=/tmp/dd.file bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 7.814102 secs (134190213 bytes/sec)

So, it's better but am I still getting what I should be seeing?
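The dd summary lines posted in this thread can be compared more easily once converted to MB/s. The following is a minimal sketch, assuming the FreeBSD dd summary format seen above ("N bytes transferred in S secs") and decimal megabytes (1 MB = 10^6 bytes); the function name is hypothetical and chosen only for illustration.

```python
import re

# Matches the byte count and elapsed time in a dd summary line as posted above.
DD_SUMMARY = re.compile(r"(\d+) bytes transferred in ([\d.]+) secs")


def dd_throughput_mb_s(summary_line):
    """Return throughput in MB/s (1 MB = 1,000,000 bytes) from a dd summary line."""
    match = DD_SUMMARY.search(summary_line)
    if match is None:
        raise ValueError("not a dd summary line: %r" % summary_line)
    nbytes = int(match.group(1))
    seconds = float(match.group(2))
    return nbytes / seconds / 1e6


line = "1048576000 bytes transferred in 7.814102 secs (134190213 bytes/sec)"
print(f"{dd_throughput_mb_s(line):.1f} MB/s")
```

Feeding the other dd summary lines from the thread through the same function gives figures on the same scale, which makes the before and after write rates directly comparable.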
From mav at mavhome.dp.ua Wed Mar 5 20:56:25 2008
From: mav at mavhome.dp.ua (Alexander Motin)
Date: Wed Mar 5 21:52:28 2008
Subject: Memory allocation performance
In-Reply-To: <1202113382.00019874.1202101201@10.7.7.3>
References: <1201839789.00018590.1201827602@10.7.7.3> <1201836184.00018598.1201823403@10.7.7.3> <1201868581.00018705.1201855203@10.7.7.3> <1201904582.00018976.1201893001@10.7.7.3> <1201965806.00019169.1201955401@10.7.7.3> <1201958583.00019173.1201947005@10.7.7.3> <1202001786.00019395.1201991402@10.7.7.3> <1202005381.00019409.1201992602@10.7.7.3> <1202095383.00019854.1202083801@10.7.7.3> <1202113382.00019874.1202101201@10.7.7.3>
Message-ID: <47CEFAE0.9000402@mavhome.dp.ua>

Bruce Evans wrote:
> Try profiling it on another type of CPU, to get different performance
> counters but hopefully not very different stalls. If the other CPU doesn't
> stall at all, put another black mark against P4 and delete your copies of
> it :-).

I have tried to profile the same system with the same load on different hardware:
- was: Pentium4 2.8 on an ASUS MB based on the i875G chipset,
- now: PentiumD 3.0 on a Supermicro PDSMi board based on the E7230 chipset.

The results are completely different. The problem is gone:

                0.03    0.04  538550/2154375     ip_forward [11]
                0.03    0.04  538562/2154375     em_get_buf [32]
                0.07    0.08 1077100/2154375     ng_package_data [26]
[15]     1.8    0.14    0.15 2154375             uma_zalloc_arg [15]
                0.06    0.00 1077151/3232111     generic_bzero [22]
                0.03    0.00  538555/538555      mb_ctor_mbuf [60]
                0.03    0.00 2154375/4421407     critical_exit [63]

                0.02    0.01  538554/2154376     m_freem [42]
                0.02    0.01  538563/2154376     mb_free_ext [54]
                0.04    0.03 1077100/2154376     ng_free_item [48]
[30]     0.9    0.08    0.06 2154376             uma_zfree_arg [30]
                0.03    0.00 2154376/4421407     critical_exit [63]
                0.00    0.01  538563/538563      mb_dtor_pack [82]
                0.01    0.00 2154376/4421971     critical_enter [69]

So probably it was some hardware related problem.
First MB has video integrated to chipset without any dedicated memory, possibly it affected memory performance in some way. On the first system there were such messages on boot: Mar 3 23:01:20 swamp kernel: acpi0: reservation of 0, a0000 (3) failed Mar 3 23:01:20 swamp kernel: acpi0: reservation of 100000, 3fdf0000 (3) failed Mar 3 23:01:20 swamp kernel: agp0: on vgapci0 Mar 3 23:01:20 swamp kernel: agp0: detected 892k stolen memory Mar 3 23:01:20 swamp kernel: agp0: aperture size is 128M , can they be related? -- Alexander Motin From linimon at lonesome.com Wed Mar 5 23:04:51 2008 From: linimon at lonesome.com (Mark Linimon) Date: Thu Mar 6 01:33:53 2008 Subject: FreeBSD bind performance in FreeBSD 7 In-Reply-To: Message-ID: <20080305223151.GA8626@soaustin.net> > > * I am trying to understand what is different about the ISC > > configuration but have not yet found the cause. > It's called "Anti-FreeBSD bias". You won't find anything. If this is true, please try to explain to me the following: - ISC hosts 5 Netra 1s that comprise most of our sparc64 package build cluster. They are allowing us to add 4 more next week. - ISC hosts 3 amd64 machines for our amd64 package build cluster. - ISC used to host 3 alpha machines, until we retired them. - ISC hosts ftp4.freebsd.org, which is one of the 2 machines that the address ftp.freebsd.org rotors to. This is an extremely high- bandwidth machine. - ISC hosts several other development machines (I am not aware of all the exact ones). All of this has been in place for years, with the space, power, and cooling all donated for free. Kris and others have been doing a tremendous amount of work over the past 2 years to identify and fix performance problems in FreeBSD. There have been literally hundreds of regression tests run, resulting in a large number of cycles of commit/test. Sometimes the commits do what we expect, sometimes no. Lather, rinse, repeat. 
The difference in performance between 6.3R and 7.0R is primarily due to all this effort. ISC's re-tests seems to confirm the improvements. The current speculation is that the difference in the measurements we're seeing could well be due to our drivers. If so, let's identify and fix the problems. Otherwise, let's try to understand whether there are any meaningful differences in the way the tests are being run. Casting aspersions on someone's methodology or motives just because you (or I) don't like the results is merely nonsense. AFAICT ISC's business model primarily consists of them selling the ability of bind to perform under load. That's the variable they have to optimize for. Let's hope that we are part of helping them to do just that. mcl From alan.bryan at yahoo.com Thu Mar 6 02:07:38 2008 From: alan.bryan at yahoo.com (alan bryan) Date: Thu Mar 6 02:07:43 2008 Subject: 7.0-Release and 3ware 9550SXU w/BBU - horrible write performance In-Reply-To: <96374.85181.qm@web50511.mail.re2.yahoo.com> Message-ID: <857138.52972.qm@web50511.mail.re2.yahoo.com> --- alan bryan wrote: > > --- alan bryan wrote: > > > Hi, > > > > I've got a new server with a 3ware 9550SXU with > the > > Battery. I am using FreeBSD 7.0-Release (tried > both > > 4BSD and ULE) using AMD64 and the 3ware > performance > > for writes is just plain horrible. Something is > > obviously wrong but I'm not sure what. > > > > I've got a 4 disk RAID 10 array. > > > > According to 3dm2 the cache is on. I even tried > > setting The StorSave preference to "Performance" > > with > > no real benefit. There seems to be something > really > > wrong with disk performance. So, all of this seems to be due to the Battery unit. What seems to be happening is that if the battery isn't fully charged the write cache get disabled. So, when I was testing and removing the card, changing cabling, etc... 
with the server powered off the battery would get ever so slightly discharged, and then when booting back up the server has the cache off until it gets topped back up again. Thus, I was seeing weird inconsistent results all over the place.

I had to move the BBU and disconnected it for a few minutes and then turned the server back on. When it came back up, 3dm2 reported that the BBU was charging and that write cache was off. Once recharged (15 min or so) the cache was turned back on. This could be really bad on a loaded server if you have to take it down for a service window of a few minutes and then boot back up with the write cache off. I found that I could log into 3dm2 and change the StorSave profile to "Performance", which would then enable the cache again (it ignores whether you have the BBU). Then once the battery is charged you can go back to the "Balanced" profile.

I guess the lesson learned is to make sure the server has been powered up for 24 hrs and the battery is fully charged before doing any tests against this 3ware card.

From ivoras at freebsd.org Thu Mar 6 09:54:03 2008
From: ivoras at freebsd.org (Ivan Voras)
Date: Thu Mar 6 09:54:08 2008
Subject: 7.0-Release and 3ware 9550SXU w/BBU - horrible write performance
In-Reply-To: <96374.85181.qm@web50511.mail.re2.yahoo.com>
References: <497790.39526.qm@web50507.mail.re2.yahoo.com> <96374.85181.qm@web50511.mail.re2.yahoo.com>
Message-ID:

alan bryan wrote:
> --- alan bryan wrote:
>> I've got a 4 disk RAID 10 array.
> Version 1.93d       ------Sequential Output------ --Sequential Input- --Random-
> Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> server          16G   530  99 120160  23 28207   8   903  99 109688  16 518.7  18

> So, it's better but am I still getting what I should
> be seeing?

It seems about right - twice the performance of a single drive is OK for a 4-drive RAID10.

From tedm at toybox.placo.com Fri Mar 7 12:11:22 2008
From: tedm at toybox.placo.com (Ted Mittelstaedt)
Date: Fri Mar 7 12:11:26 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To: <47CCE982.4060201@isc.org>
Message-ID:

> -----Original Message-----
> From: Peter Losher [mailto:Peter_Losher@isc.org]
> Sent: Monday, March 03, 2008 10:18 PM
> To: Ted Mittelstaedt
> Cc: freebsd-performance@freebsd.org; freebsd-questions@freebsd.org
> Subject: Re: FreeBSD bind performance in FreeBSD 7
>
>
> Yeah, ISC just hates FreeBSD...

This final report here:

ftp://ftp.isc.org/isc/dns_perf/ISC-TN-2008-1.pdf

is LIGHTYEARS different from the draft here:

http://new.isc.org/proj/dnsperf/OStest.html

The draft contains the conclusion:

"...We will use Linux Gentoo 2.6.20.7 for further production testing. We brought these numbers to the attention of the FreeBSD development team, and will re-test when FreeBSD 7.1 is released..."

This is completely missing in the final. Added is a bunch of praise of BIND on commodity hardware. And also added is the line:

"...All computers in the testbed were loaded with identical copies of FreeBSD 7.0-RELEASE..."

which is missing in the draft.
So in other words, it certainly appears that the final is 180 degrees opposite in its discussion of FreeBSD. The draft appears to suggest avoiding it - the final appears to suggest embracing it.

So what, exactly may I ask, were you expecting after writing that draft? Everyone here to be happy? It almost seems to me like the draft was a trial balloon floated to get the FreeBSD developers to jump in and do some coding for you at the last minute.

But, I'll say no more about that and turn towards the report - because it has some significant problems. I'll start with the beginning:

"...We have been particularly interested in the performance of DNSSEC mechanisms within the DNS protocols and on the use of BIND as a production server for major zones..."

OK, fine and good. However, the conclusion is rather different:

"...Commodity hardware with BIND 9 software is fast enough by a wide margin to run large production name service..."

What is going on here? This project started out as purely observational - merely interested in BIND performance - and ended up being a proof for the hypothesis that BIND is good enough to run large nameservers on commodity hardware.

In short, the report is moving from an objective view to a subjective goal of proving BIND is kick-ass.

It is interesting how the original draft conclusion IS NOT subjective with regards to BIND (it is with regards to FreeBSD of course) and uses the phrase "further production testing", implying that BIND is still under development, while the final report uses the language:

"...open-source software and commodity hardware are up to the task of providing full-scale production name..."

which definitely implies that BIND is "done" and ready for production.

Another thing of interest concerns the OS. Microsoft Windows 2003 server is included in the first breaking point test. It is absent from the other tests. And the version chosen is old, old; it is NOT even Server 2003 R2, nor the RC of Server 2008 which is available.
Why were the Windows test results even left in the published report at all? What purpose do they serve other than as a feel-good "bash Windows"? If you really were interested in the results of testing, you would have wanted to know how BIND did under Windows for the other tests. But, as I pointed out, by the time the later tests were run the goal had stopped being the pure objective observational goal, and become the subjective "prove BIND is the best" goal. And as the Windows results for the breaking test were so low, it was an embarrassment to keep bothering with it, so it was dropped.

The report also suffers from NOT listing out the components of the HP servers and instead offering a link to HP. Yeah, how long is that link going to be valid? HP changes its website and changes its product lineup as often as I change my underpants - a year from now, that product will be gone and a new reader will have a snowball's chance in Hell of getting the actual server specs, and I mean the chipsets in use for the disk controller, nic card, video, etc. You know, the stuff that actually -affects- the performance of different operating systems.

But the biggest hole is the report conclusion and this shift from objective, to subjective, reporting. The conclusion claims BIND is great on commodity hardware, but what it ONLY has proven is that BIND is great on this one specific hardware platform running a couple of specific operating systems.

If you really wanted to merely objectively observe BIND on commodity hardware you should have had your testers stay out of the setup of the OS and platform. You should have called up the developers of the various operating systems you were going to use - Microsoft among them - and told them to each send in a group that would build a server to their spec.
You should have merely set a maximum limit that the server could cost that was in line with commodity server hardware costs - something like $2K and it had to be name-brand, for example - and let all of the vested interest groups do their best to create a server that would run as fast as they could in those constraints.

In short, if the testers are setting out to prove BIND is really powerful, they are essentially trying to write a benchmark. And the way you do that is by deliberately pulling all the stops out to make your stuff run as lickety-split as possible - then you document the crap out of everything you did to make it run lickety-split, so that anyone else can come along, set up the stuff the same way you did, and then get the same results. Benchmarks are subjective and they are expected to be subjective - but when you write them, you're admitting your testers are being subjective. In that case, there is no point in having an OS bake-off since you're going to have your testers select the OS that will give the best shine to your product.

The report needs to make up its mind what it's actually trying to accomplish: objective, or subjective, reporting.

Ted

From ender at enderzone.com Fri Mar 7 16:55:08 2008
From: ender at enderzone.com (Simon Dircks)
Date: Fri Mar 7 16:55:37 2008
Subject: FreeBSD bind performance in FreeBSD 7
In-Reply-To:
References:
Message-ID: <47D16CDA.9090200@enderzone.com>

Ted Mittelstaedt wrote:
>
>> -----Original Message-----
>> From: Peter Losher [mailto:Peter_Losher@isc.org]
>> Sent: Monday, March 03, 2008 10:18 PM
>> To: Ted Mittelstaedt
>> Cc: freebsd-performance@freebsd.org; freebsd-questions@freebsd.org
>> Subject: Re: FreeBSD bind performance in FreeBSD 7
>>
>>
>> Yeah, ISC just hates FreeBSD...
>> > > This final report here: > > ftp://ftp.isc.org/isc/dns_perf/ISC-TN-2008-1.pdf > > is LIGHTYEARS different than the draft here: > > http://new.isc.org/proj/dnsperf/OStest.html > > > The draft contains the conclusion: > > You change your underpants once a year? From tedm at toybox.placo.com Sat Mar 8 18:32:32 2008 From: tedm at toybox.placo.com (Ted Mittelstaedt) Date: Sat Mar 8 18:32:43 2008 Subject: FreeBSD bind performance in FreeBSD 7 In-Reply-To: <47D16CDA.9090200@enderzone.com> Message-ID: > -----Original Message----- > From: Simon Dircks [mailto:ender@enderzone.com] > Sent: Friday, March 07, 2008 8:27 AM > To: Ted Mittelstaedt > Cc: Peter Losher; freebsd-performance@freebsd.org; > freebsd-questions@freebsd.org > Subject: Re: FreeBSD bind performance in FreeBSD 7 > > > Ted Mittelstaedt wrote: > > > >> -----Original Message----- > >> From: Peter Losher [mailto:Peter_Losher@isc.org] > >> Sent: Monday, March 03, 2008 10:18 PM > >> To: Ted Mittelstaedt > >> Cc: freebsd-performance@freebsd.org; freebsd-questions@freebsd.org > >> Subject: Re: FreeBSD bind performance in FreeBSD 7 > >> > >> > >> Yeah, ISC just hates FreeBSD... > >> > > > > This final report here: > > > > ftp://ftp.isc.org/isc/dns_perf/ISC-TN-2008-1.pdf > > > > is LIGHTYEARS different than the draft here: > > > > http://new.isc.org/proj/dnsperf/OStest.html > > > > > > The draft contains the conclusion: > > > > > > You change your underpants once a year? > I just throw them against the wall - if they stick, it's time for a change. Seriously if you think HP only changes it's product lineup once a year you haven't bought much HP. It's a very common occurance for us to make up a quote for a new HP server then by the time the customer signs off on it and we are able to go order the server, we find it on the "constrained" list because they are replacing it with yet another model change. 
Ted From killing at multiplay.co.uk Sat Mar 8 18:57:42 2008 From: killing at multiplay.co.uk (Steven Hartland) Date: Sat Mar 8 18:57:46 2008 Subject: rrdtool / mtr causing stalling on 7.0 Message-ID: <056601c8814c$516c0370$b6db87d4@multiplay.co.uk> We've been suffering on our stats box for some time now where by the machine will just stall for several seconds preventing everything from tab completion to vi newfile.txt. I was hoping an upgrade to 7.0 and ULE may help the situation but unfortunately it hasn't. I've attached both dmesg and output from lock profiling during a 5 minute period where I know the stall happened at least once. Any advice / pointers would be gratefully received. Regards Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. 
From noc at hdk5.net Sat Mar 8 20:40:51 2008 From: noc at hdk5.net (Al Plant) Date: Sat Mar 8 21:01:06 2008 Subject: FreeBSD bind performance in FreeBSD 7 In-Reply-To: References: Message-ID: <47D2F59C.2040806@hdk5.net> Ted Mittelstaedt wrote: > >> -----Original Message----- >> From: Simon Dircks [mailto:ender@enderzone.com] >> Sent: Friday, March 07, 2008 8:27 AM >> To: Ted Mittelstaedt >> Cc: Peter Losher; freebsd-performance@freebsd.org; >> freebsd-questions@freebsd.org >> Subject: Re: FreeBSD bind performance in FreeBSD 7 >> >> >> Ted Mittelstaedt wrote: >> >>> >>> >>>> -----Original Message----- >>>> From: Peter Losher [mailto:Peter_Losher@isc.org] >>>> Sent: Monday, March 03, 2008 10:18 PM >>>> To: Ted Mittelstaedt >>>> Cc: freebsd-performance@freebsd.org; freebsd-questions@freebsd.org >>>> Subject: Re: FreeBSD bind performance in FreeBSD 7 >>>> >>>> >>>> Yeah, ISC just hates FreeBSD... >>>> >>>> >>> This final report here: >>> >>> ftp://ftp.isc.org/isc/dns_perf/ISC-TN-2008-1.pdf >>> >>> is LIGHTYEARS different than the draft here: >>> >>> http://new.isc.org/proj/dnsperf/OStest.html >>> >>> >>> The draft contains the conclusion: >>> >>> >>> >> You change your underpants once a year? >> >> > > I just throw them against the wall - if they stick, it's time > for a change. > > Seriously if you think HP only changes it's product lineup > once a year you haven't bought much HP. It's a very common > occurance for us to make up a quote for a new HP server then > by the time the customer signs off on it and we are able > to go order the server, we find it on the "constrained" list > because they are replacing it with yet another model change. 
> > Ted > _______________________________________________ > freebsd-questions@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-questions > To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org" > > Aloha Ted, Dell sends many of its products in a single purchase out with nic cards and other components that are not the same in every box too. ~Al Plant - Honolulu, Hawaii - Phone: 808-284-2740 + http://hawaiidakine.com + http://freebsdinfo.org + + http://aloha50.net - Supporting - FreeBSD 6.* - 7.* - 8.* + < email: noc@hdk5.net > "All that's really worth doing is what we do for others."- Lewis Carrol From rwatson at FreeBSD.org Sat Mar 8 22:23:31 2008 From: rwatson at FreeBSD.org (Robert Watson) Date: Sat Mar 8 22:23:35 2008 Subject: rrdtool / mtr causing stalling on 7.0 In-Reply-To: <056601c8814c$516c0370$b6db87d4@multiplay.co.uk> References: <056601c8814c$516c0370$b6db87d4@multiplay.co.uk> Message-ID: <20080308221441.E11432@fledge.watson.org> On Sat, 8 Mar 2008, Steven Hartland wrote: > We've been suffering on our stats box for some time now where by the machine > will just stall for several seconds preventing everything from tab > completion to vi newfile.txt. > > I was hoping an upgrade to 7.0 and ULE may help the situation but > unfortunately it hasn't. > > I've attached both dmesg and output from lock profiling during a 5 minute > period where I know the stall happened at least once. > > Any advice / pointers would be gratefully received. It looks like the attachment got lost on the way through the mailing list. I think the first starting point is: what sort of stall is this? Is it, for example, all network communication stalling, all disk I/O stalling, or the entire kernel and all processes stalling? The usual diagnostics are: - Does the machine stop responding to pings while stalled, and/or possibly "catch up" all at once when it recovers? 
- If you run the following loop on the machine without any network or console
  I/O, do you see gaps in time stamps:

    while (1) {
        sleep 1
        date >> date.log
    }

- If you write a short C program that looks a lot like the above loop, but
  logs time stamps into an in-memory buffer, and have it look for gaps in the
  sequence of >3 seconds, does it run across the stall?

Robert N M Watson
Computer Laboratory
University of Cambridge

From killing at multiplay.co.uk Sun Mar 9 01:14:17 2008
From: killing at multiplay.co.uk (Steven Hartland)
Date: Sun Mar 9 01:14:21 2008
Subject: rrdtool / mtr causing stalling on 7.0
References: <056601c8814c$516c0370$b6db87d4@multiplay.co.uk> <20080308221441.E11432@fledge.watson.org>
Message-ID: <006401c88181$25cf0e30$b6db87d4@multiplay.co.uk>

----- Original Message -----
From: "Robert Watson"

> It looks like the attachment got lost on the way through the mailing list.
>
> I think the first starting point is: what sort of stall is this? Is it, for
> example, all network communication stalling, all disk I/O stalling, or the
> entire kernel and all processes stalling? The usual diagnostics are:
>
> - Does the machine stop responding to pings while stalled, and/or possibly
>   "catch up" all at once when it recovers?
>
> - If you run the following loop on the machine without any network or console
>   I/O, do you see gaps in time stamps:
>
>     while (1) {
>         sleep 1
>         date >> date.log
>     }
>
> - If you write a short C program that looks a lot like the above loop, but
>   logs time stamps into an in-memory buffer, and have it look for gaps in the
>   sequence of >3 seconds, does it run across the stall?

Thanks for the ideas Robert. The output from the shell script shows significant gaps:-

Sun Mar 9 00:20:33 GMT 2008
Sun Mar 9 00:20:34 GMT 2008 <== Stall
Sun Mar 9 00:21:09 GMT 2008
Sun Mar 9 00:21:10 GMT 2008
...
Sun Mar 9 00:25:23 GMT 2008
Sun Mar 9 00:25:24 GMT 2008
Sun Mar 9 00:25:25 GMT 2008
Sun Mar 9 00:25:27 GMT 2008 <== Stall
Sun Mar 9 00:25:53 GMT 2008
Sun Mar 9 00:25:59 GMT 2008
Sun Mar 9 00:26:00 GMT 2008

Running a ping alongside shows no missed responses.

Enabling lock profiling for the period changes the behaviour somewhat, producing shorter but multiple stalls.

Sun Mar 9 00:30:31 GMT 2008
Sun Mar 9 00:30:32 GMT 2008
Sun Mar 9 00:30:34 GMT 2008
Sun Mar 9 00:30:35 GMT 2008
Sun Mar 9 00:30:36 GMT 2008
Sun Mar 9 00:30:37 GMT 2008
Sun Mar 9 00:30:38 GMT 2008
Sun Mar 9 00:30:41 GMT 2008
Sun Mar 9 00:30:42 GMT 2008 <== Stall
Sun Mar 9 00:30:44 GMT 2008
Sun Mar 9 00:30:45 GMT 2008 <== Stall
Sun Mar 9 00:30:47 GMT 2008 <== Stall
Sun Mar 9 00:30:49 GMT 2008
Sun Mar 9 00:30:50 GMT 2008 <== Stall
Sun Mar 9 00:30:52 GMT 2008 <== Stall
Sun Mar 9 00:30:54 GMT 2008
Sun Mar 9 00:30:55 GMT 2008
Sun Mar 9 00:30:56 GMT 2008
Sun Mar 9 00:30:57 GMT 2008 <== Stall
Sun Mar 9 00:31:03 GMT 2008 <== Stall
Sun Mar 9 00:31:05 GMT 2008
Sun Mar 9 00:31:06 GMT 2008 <== Stall
Sun Mar 9 00:31:08 GMT 2008
Sun Mar 9 00:31:09 GMT 2008
Sun Mar 9 00:31:10 GMT 2008
Sun Mar 9 00:31:11 GMT 2008 <== Stall
Sun Mar 9 00:31:14 GMT 2008
Sun Mar 9 00:31:15 GMT 2008
Sun Mar 9 00:31:16 GMT 2008 <== Stall
Sun Mar 9 00:31:20 GMT 2008
Sun Mar 9 00:31:21 GMT 2008
Sun Mar 9 00:31:22 GMT 2008

Using the following C code we also see stalls:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main( int argc, char **argv )
{
    time_t last = time( NULL );

    while ( 1 )
    {
        time_t now = time( NULL );
        time_t diff = now - last;

        /* report any gap of two seconds or more between iterations */
        if ( diff >= 2 )
        {
            fprintf( stderr, "stalled for %ld seconds\n", (long)diff );
        }
        /* ctime() output already ends with a newline */
        fprintf( stderr, "%s", ctime( &now ) );
        last = now;
        sleep( 1 );
    }

    exit( 0 );
}

[date.log]
Sun Mar 9 00:55:40 GMT 2008
Sun Mar 9 00:55:43 GMT 2008 <== Stall
Sun Mar 9 00:56:11 GMT 2008
Sun Mar 9 00:56:12 GMT 2008
Sun Mar 9 00:56:13 GMT 2008
Sun Mar 9 00:56:14 GMT 2008
Sun Mar 9 00:56:15 GMT 2008
[/date.log]

[timec output]
Sun Mar 9 00:55:40 2008
Sun Mar 9 00:55:41 2008
Sun Mar 9 00:55:42 2008
stalled for 2 seconds
Sun Mar 9 00:55:44 2008
stalled for 5 seconds
Sun Mar 9 00:55:49 2008
stalled for 2 seconds
Sun Mar 9 00:55:51 2008
stalled for 2 seconds
Sun Mar 9 00:55:53 2008
Sun Mar 9 00:55:54 2008
Sun Mar 9 00:55:55 2008
Sun Mar 9 00:55:56 2008
Sun Mar 9 00:55:57 2008
Sun Mar 9 00:55:58 2008
Sun Mar 9 00:55:59 2008
Sun Mar 9 00:56:00 2008
Sun Mar 9 00:56:01 2008
Sun Mar 9 00:56:02 2008
Sun Mar 9 00:56:03 2008
Sun Mar 9 00:56:04 2008
Sun Mar 9 00:56:05 2008
Sun Mar 9 00:56:06 2008
Sun Mar 9 00:56:07 2008
Sun Mar 9 00:56:08 2008
Sun Mar 9 00:56:09 2008
Sun Mar 9 00:56:10 2008
Sun Mar 9 00:56:11 2008
Sun Mar 9 00:56:12 2008
Sun Mar 9 00:56:13 2008
Sun Mar 9 00:56:14 2008
Sun Mar 9 00:56:15 2008
[/timec output]

As the list ate the attachment, the output from the lock profile can be found here:-
ftp://ftp1.multiplay.co.uk/pub/other/freebsd-7.0-rrdtool-stall.zip

Regards
Steve

From ivoras at freebsd.org Mon Mar 10 10:48:36 2008
From: ivoras at freebsd.org (Ivan Voras)
Date: Mon Mar 10 10:48:41 2008
Subject: pgbench results
Message-ID:

Hi,

Has anyone been able to replicate results from http://www.kaltenbrunner.cc/blog/index.php?/archives/21-guid.html, or get close to the performance described there on similar hardware (e.g. thousands of transactions/s)?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: OpenPGP digital signature
Url : http://lists.freebsd.org/pipermail/freebsd-performance/attachments/20080310/0763b683/signature.pgp

From alan.bryan at yahoo.com Mon Mar 10 15:35:37 2008
From: alan.bryan at yahoo.com (alan bryan)
Date: Mon Mar 10 15:35:42 2008
Subject: pgbench results
In-Reply-To:
Message-ID: <571396.91912.qm@web50512.mail.re2.yahoo.com>

--- Ivan Voras wrote:
> Hi,
>
> Has anyone been able to replicate results from
> http://www.kaltenbrunner.cc/blog/index.php?/archives/21-guid.html,
> or get close to the performance described there on
> similar hardware (e.g. thousands of transactions/s) ?
>

Here's mine for a somewhat similar setup.
FreeBSD 7.0, PostgreSQL 8.3
2x Intel Xeon 2.33GHz quad cores (8 cores total), 8GB RAM, 250GB RAID 10 (4x WD Raptor 10K drives).

Non-default settings:

max_connections = 200
shared_buffers = 1900MB
wal_buffers = 1024kB
checkpoint_segments = 192
checkpoint_timeout = 30min

createdb testdb
pgbench -i -s 100 testdb

# pgbench -c 100 -t 100000 testdb
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 100
number of transactions per client: 100000
number of transactions actually processed: 10000000/10000000
tps = 1650.806584 (including connections establishing)
tps = 1650.905036 (excluding connections establishing)

So, not as high as his numbers, but then I've got less RAM, one less drive spindle in my array (2 vs. 3 in performance for the raid 10 setup), SATA vs. SCSI, and he's got 512MB of controller cache vs my 128MB.

--Alan

From ivoras at freebsd.org Tue Mar 11 16:10:37 2008
From: ivoras at freebsd.org (Ivan Voras)
Date: Tue Mar 11 16:10:41 2008
Subject: pgbench results
In-Reply-To: <571396.91912.qm@web50512.mail.re2.yahoo.com>
References: <571396.91912.qm@web50512.mail.re2.yahoo.com>
Message-ID:

http://www.kaltenbrunner.cc/blog/index.php?/archives/21-guid.html

alan bryan wrote:
> Here's mine for a somewhat similar setup.
> FreeBSD 7.0 PostgreSQL 8.3
> 2x Intel Xeon 2.33GHZ quad cores (8 cores total), 8GB
> RAM, 250GB RAID 10 (4x WD Raptor 10K drives).
>
> Non-default settings:
>
>
> max_connections = 200
> shared_buffers = 1900MB
> wal_buffers = 1024kB
> checkpoint_segments = 192
> checkpoint_timeout = 30min
>
> createdb testdb
> pgbench -i -s 100 testdb
>
> # pgbench -c 100 -t 100000 testdb
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 100
> number of clients: 100
> number of transactions per client: 100000
> number of transactions actually processed:
> 10000000/10000000
> tps = 1650.806584 (including connections establishing)
> tps = 1650.905036 (excluding connections establishing)
>
> So, not as high as his numbers but then I've got less
> RAM, one less drive spindle in my array (2 vs. 3 in
> performance for the raid 10 setup), SATA vs. SCSI,
> he's got 512MB of controller cache vs my 128MB.

The thing is - I *do* have a similar setup here: HP DL370 G5, 2x4-core 1.86 GHz, 4 GB RAM, 6 drives in RAID10, 512 MB cache (can pull > 200 MB/s off the array), with all settings like in the posted link except shared_buffer=1900 MB, and I "only" get this:

tps = 2834.026175 (including connections establishing)
tps = 2839.080739 (excluding connections establishing)

This is still far below the ~4500 trans/s from the link and I wonder if my results are within what I should be getting.
The benchmark in the link above was done with faster CPUs (but I'm not CPU bound - at least 30% idle), but with 3 times the memory and I'm guessing more memory would help here, but I'm not sure. What's strange is that toggling synchronous_commit doesn't have a significant effect on performance (it does increase CPU idle time). With synchronous_commit=off, I get: tps = 2886.980477 (including connections establishing) tps = 2891.776081 (excluding connections establishing) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-performance/attachments/20080311/21b2a1cb/signature.pgp From markir at paradise.net.nz Wed Mar 12 00:25:57 2008 From: markir at paradise.net.nz (Mark Kirkwood) Date: Wed Mar 12 00:26:02 2008 Subject: pgbench results In-Reply-To: References: <571396.91912.qm@web50512.mail.re2.yahoo.com> Message-ID: <47D71F6F.2090600@paradise.net.nz> Ivan Voras wrote: > > The thing is - I *do* have a similar setup here: HP DL370 G5, 2x4-core > 1.86 GHz, 4 GB RAM, 6 drives in RAID10, 512 MB cache (can pull > 200 > MB/s off the array), with all settings like in the posted link except > shared_buffer=1900 MB, and I "only" get this: > > tps = 2834.026175 (including connections establishing) > tps = 2839.080739 (excluding connections establishing) > > This is still far bellow ~~ 4500 trans/s from the link and I wonder if > my results are within what I should be getting. The benchmark in the > link above was done with faster CPUs (but I'm not CPU bound - at least > 30% idle), but with 3 times the memory and I'm guessing more memory > would help here, but I'm not sure. > > What's strange is that toggling synchronous_commit doesn't have a > significant effect on performance (it does increase CPU idle time). 
With > synchronous_commit=off, I get: > > tps = 2886.980477 (including connections establishing) > tps = 2891.776081 (excluding connections establishing) > > The article refers to a controller with a battery backed write cache - that could easily explain the difference if you do not have one (he's paying nothing for fsync wheres you are). regards Mark From markir at paradise.net.nz Wed Mar 12 03:42:58 2008 From: markir at paradise.net.nz (Mark Kirkwood) Date: Wed Mar 12 03:43:02 2008 Subject: pgbench results In-Reply-To: <47D71F6F.2090600@paradise.net.nz> References: <571396.91912.qm@web50512.mail.re2.yahoo.com> <47D71F6F.2090600@paradise.net.nz> Message-ID: <47D7512F.5020006@paradise.net.nz> Mark Kirkwood wrote: > Ivan Voras wrote: >> >> The thing is - I *do* have a similar setup here: HP DL370 G5, 2x4-core >> 1.86 GHz, 4 GB RAM, 6 drives in RAID10, 512 MB cache (can pull > 200 >> MB/s off the array), with all settings like in the posted link except >> shared_buffer=1900 MB, and I "only" get this: >> >> tps = 2834.026175 (including connections establishing) >> tps = 2839.080739 (excluding connections establishing) >> >> This is still far bellow ~~ 4500 trans/s from the link and I wonder if >> my results are within what I should be getting. The benchmark in the >> link above was done with faster CPUs (but I'm not CPU bound - at least >> 30% idle), but with 3 times the memory and I'm guessing more memory >> would help here, but I'm not sure. >> >> What's strange is that toggling synchronous_commit doesn't have a >> significant effect on performance (it does increase CPU idle time). With >> synchronous_commit=off, I get: >> >> tps = 2886.980477 (including connections establishing) >> tps = 2891.776081 (excluding connections establishing) >> >> > > The article refers to a controller with a battery backed write cache - > that could easily explain the difference if you do not have one (he's > paying nothing for fsync wheres you are). 
> Hmm - somehow read right past the bit where you say you have a 512MB cache - sorry! However, worth checking it is set to write-back rather than write-through. Cheers Mark From ivoras at freebsd.org Wed Mar 12 10:20:02 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Wed Mar 12 10:20:06 2008 Subject: pgbench results In-Reply-To: <47D7512F.5020006@paradise.net.nz> References: <571396.91912.qm@web50512.mail.re2.yahoo.com> <47D71F6F.2090600@paradise.net.nz> <47D7512F.5020006@paradise.net.nz> Message-ID: <9bbcef730803120255w531636d5h642e407aa1881b6a@mail.gmail.com> On 12/03/2008, Mark Kirkwood wrote: > Hmm - somehow read right past the bit where you say you have a 512MB > cache - sorry! However, worth checking it is set to write-back rather > than write-through. As far as I can see it is set to write-through (though the HP's array configuration utility isn't explicit about it, everything performance-wise than can be turned on is turned on, including write cache). From jroberson at chesapeake.net Thu Mar 13 00:52:29 2008 From: jroberson at chesapeake.net (Jeff Roberson) Date: Thu Mar 13 00:52:34 2008 Subject: pgbench results In-Reply-To: <9bbcef730803120255w531636d5h642e407aa1881b6a@mail.gmail.com> References: <571396.91912.qm@web50512.mail.re2.yahoo.com> <47D71F6F.2090600@paradise.net.nz> <47D7512F.5020006@paradise.net.nz> <9bbcef730803120255w531636d5h642e407aa1881b6a@mail.gmail.com> Message-ID: <20080312145307.P1091@desktop> On Wed, 12 Mar 2008, Ivan Voras wrote: > On 12/03/2008, Mark Kirkwood wrote: > >> Hmm - somehow read right past the bit where you say you have a 512MB >> cache - sorry! However, worth checking it is set to write-back rather >> than write-through. > > As far as I can see it is set to write-through (though the HP's array > configuration utility isn't explicit about it, everything > performance-wise than can be turned on is turned on, including write > cache). What kernel are you running? 
> _______________________________________________ > freebsd-performance@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-performance > To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org" > From ivoras at freebsd.org Thu Mar 13 09:55:41 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Thu Mar 13 09:55:45 2008 Subject: pgbench results In-Reply-To: <9bbcef730803120255w531636d5h642e407aa1881b6a@mail.gmail.com> References: <571396.91912.qm@web50512.mail.re2.yahoo.com> <47D71F6F.2090600@paradise.net.nz> <47D7512F.5020006@paradise.net.nz> <9bbcef730803120255w531636d5h642e407aa1881b6a@mail.gmail.com> Message-ID: Ivan Voras wrote: > On 12/03/2008, Mark Kirkwood wrote: > >> Hmm - somehow read right past the bit where you say you have a 512MB >> cache - sorry! However, worth checking it is set to write-back rather >> than write-through. > > As far as I can see it is set to write-through (though the HP's array Sorry, this should be "it is NOT set to write-through" > configuration utility isn't explicit about it, everything > performance-wise than can be turned on is turned on, including write > cache). -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-performance/attachments/20080313/8be8c349/signature.pgp From ivoras at freebsd.org Thu Mar 13 09:55:59 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Thu Mar 13 09:56:02 2008 Subject: pgbench results In-Reply-To: <20080312145307.P1091@desktop> References: <571396.91912.qm@web50512.mail.re2.yahoo.com> <47D71F6F.2090600@paradise.net.nz> <47D7512F.5020006@paradise.net.nz> <9bbcef730803120255w531636d5h642e407aa1881b6a@mail.gmail.com> <20080312145307.P1091@desktop> Message-ID: <9bbcef730803130255p68d35240pfc627fe5acd5b659@mail.gmail.com> On 13/03/2008, Jeff Roberson wrote: > > On Wed, 12 Mar 2008, Ivan Voras wrote: > > > On 12/03/2008, Mark Kirkwood wrote: > > > >> Hmm - somehow read right past the bit where you say you have a 512MB > >> cache - sorry! However, worth checking it is set to write-back rather > >> than write-through. > > > > As far as I can see it is set to write-through (though the HP's array > > configuration utility isn't explicit about it, everything > > performance-wise than can be turned on is turned on, including write > > cache). > > What kernel are you running? 7-STABLE since Feb 29, amd64+ULE. From hugoboy at inbox.lv Fri Mar 14 11:12:26 2008 From: hugoboy at inbox.lv (hugoboy@inbox.lv) Date: Fri Mar 14 11:12:31 2008 Subject: FreeBSD 7.0 bridge tuning Message-ID: <1205491910.47da58c6ecb6a@www.inbox.lv> Hello! I'm trying to tune FreeBSD 7.0 bridge. Environment: Server - 2 x Xeon 3GHz, 2 x Gb LAN(em driver) + 1 LAN for management, 1GB RAM. Testers -2 x Sunrise Telecom 100Mbit Ethernet testers for traffic generation. What I have intended to achieve is to substitute proprietary traffic shaper Allot with FreeBSD traffic shaper(Bridge + PF + ALTQ). 
The minimum task is to make the FreeBSD shaper perform perfectly with 100Mbit traffic across the whole spectrum of packet lengths (from 64 bytes to at least 1518 bytes).

The situation now: with pf turned off there is no problem, bridge throughput is 100Mbit/s with no packet loss (starting from 64 byte packets).

With pf on I have these statistics:

packet length (bytes) -> Mbit/s without packet loss
64   -> 46
100  -> 66
150  -> 94
>200 -> 100

The kernel/sysctl configuration is shown below. I don't know what else I can tune. It seems to me that the bottleneck is somewhere around pf/kernel buffers of packet headers. I read somewhere that in bridging the packet payload does not travel through the whole stack - just the header is evaluated. With 64 byte packets there are more packets per unit of time for the same bandwidth on the interfaces, and as the plain layer2 bridge performs 100Mbit/s with no problem, the problem is above layer2 :)

btw: kern.polling.enable=1 does not help - at packet length 64 bytes performance is 2x worse than with interrupts.

kernel:
---------------------------
cpu             I686_CPU
ident           ALLOT

# To statically compile in device wiring instead of /boot/device.hints
#hints          "GENERIC.hints"         # Default places to look for devices.
makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols

options         SCHED_ULE               # ULE scheduler
#options        SCHED_4BSD              # 4BSD scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
#options        INET6                   # IPv6 communications protocols
#options        SCTP                    # Stream Control Transmission Protocol
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         UFS_GJOURNAL            # Enable gjournal-based UFS journaling
options         MD_ROOT                 # MD is a potential root device
options         NFSCLIENT               # Network Filesystem Client
options         NFSSERVER               # Network Filesystem Server
options         NFS_ROOT                # NFS usable as /, requires NFSCLIENT
options         MSDOSFS                 # MSDOS Filesystem
options         CD9660                  # ISO 9660 Filesystem
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_PART_GPT           # GUID Partition Tables.
options         GEOM_LABEL              # Provides labelization
options         COMPAT_43TTY            # BSD 4.3 TTY compat [KEEP THIS!]
options         COMPAT_FREEBSD4         # Compatible with FreeBSD4
options         COMPAT_FREEBSD5         # Compatible with FreeBSD5
options         COMPAT_FREEBSD6         # Compatible with FreeBSD6
options         SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
options         KTRACE                  # ktrace(1) support
options         SYSVSHM                 # SYSV-style shared memory
options         SYSVMSG                 # SYSV-style message queues
options         SYSVSEM                 # SYSV-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         ADAPTIVE_GIANT          # Giant mutex is adaptive.
options         STOP_NMI                # Stop CPUS using NMI instead of IPI
options         AUDIT                   # Security event auditing

options         ALTQ
options         ALTQ_CBQ
options         ALTQ_RED
options         ALTQ_RIO
options         ALTQ_HFSC
options         ALTQ_CDNR
options         ALTQ_PRIQ
options         ALTQ_NOPCC
options         HZ=1000
options         DEVICE_POLLING
options         IPSTEALTH
options         ZERO_COPY_SOCKETS
options         MPTABLE_FORCE_HTT       # Enable HTT CPUs with the MP Table
options         IPI_PREEMPTION

# To make an SMP kernel, the next two lines are needed
options         SMP                     # Symmetric MultiProcessor Kernel
device          apic                    # I/O APIC
--------------------------------

/etc/sysctl.conf
#kern.polling.enable=1
kern.ipc.nmbcluster=32768
kern.ipc.maxsockbufs=2097152
kern.ipc.somaxconn=8192
kern.maxfiles=65536
kern.maxfilesperproc=32768
net.inet.tcp.delayed_ack=0
net.inet.tcp.sendspace=65535
net.inet.udp.recvspace=65535
net.inet.udp.maxdgram=57344
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535
kern.polling.user_frac=20
net.isr.direct=0
net.inet.ip.forwarding=1
-------------------------------

P.S. I tried pfSense, but as we have used Allot before, we need to
see queue statistics in graphs per queue; pfSense just offers numbers.
It seems to me that pfSense is good for many things but not for
bridge + traffic shaping - correct me if I'm wrong.

Best regards,
Ugis

From hugoboy at inbox.lv  Fri Mar 14 12:56:05 2008
From: hugoboy at inbox.lv (hugoboy@inbox.lv)
Date: Fri Mar 14 12:56:10 2008
Subject: FreeBSD 7.0 bridge tuning
Message-ID: <1205499361.47da75e1ac733@www.inbox.lv>

I chose pf+altq as the traffic shaper solution because it seems to match
my needs better. I use ipfw where a firewall is needed, but from the point
of view of easy administration pf+altq is better for traffic shaping.
So I have not tested shaping performance with ipfw, as I have chosen pf
this time. If it turns out that good enough results cannot be achieved
this way, I'll try ipfw+dummynet.
The server is i386 based.

Still, I am quite sure that it is possible to tune this configuration,
but I need to find the bottleneck...
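
For reference, the throughput table in the original post can be converted
to packets per second. The following is a minimal sketch and not part of
the thread: the function names are made up for illustration, and it is an
assumption whether the tester's Mbit/s figures count only frame bytes or
also the 20 bytes of preamble and inter-frame gap that each Ethernet frame
occupies on the wire, so both readings are shown.

```c
#include <stdint.h>

/*
 * Bytes each Ethernet frame occupies on the wire in addition to the
 * frame itself: 7-byte preamble + 1-byte start delimiter + 12-byte
 * inter-frame gap.
 */
#define ETH_WIRE_OVERHEAD 20

/* Frames per second if rate_bps counts only the frame bytes. */
uint64_t pps_frame_rate(uint64_t rate_bps, unsigned frame_len)
{
    return rate_bps / ((uint64_t)frame_len * 8);
}

/* Frames per second if rate_bps is the line rate, overhead included. */
uint64_t pps_line_rate(uint64_t rate_bps, unsigned frame_len)
{
    return rate_bps / ((uint64_t)(frame_len + ETH_WIRE_OVERHEAD) * 8);
}
```

Under the line-rate reading, the three lossy rows (64 -> 46, 100 -> 66,
150 -> 94) all work out to roughly 68,000-69,000 packets per second,
which would be consistent with a per-packet rather than a per-byte limit.
Under the frame-bytes reading they range from about 78,000 to 90,000.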
Ugis >Hi, >Just for my information, what performance if you replace pf to ipfw ? >and what freebsd v7.0 version ? i386 or amd64 ? >Regards >Rmkml On Fri, 14 Mar 2008, hugoboy@inbox.lv wrote: > Date: Fri, 14 Mar 2008 12:51:50 +0200 > From: hugoboy@inbox.lv > To: freebsd-performance@freebsd.org > Subject: FreeBSD 7.0 bridge tuning > > Hello! > > I'm trying to tune FreeBSD 7.0 bridge. > > Environment: > Server - 2 x Xeon 3GHz, 2 x Gb LAN(em driver) + 1 LAN for management, > 1GB RAM. > Testers -2 x Sunrise Telecom 100Mbit Ethernet testers for traffic > generation. > > What I have intended to achieve is to substitute proprietary traffic > shaper Allot with FreeBSD traffic shaper(Bridge + PF + ALTQ). > The minimum task is to make FreeBSD shaper to perform perfectly with > 100Mbit traffic in all spectrum of packet lengths (from 64 bytes to > at least 1518 bytes) > > The situation now: > with pf turned off - there is no problem, bridge throughput is > 100Mbit/s no packet loss (starting from 64 byte packets) > > With pf on I have statistics: > packet lengt -> Mbit/s without packet loss > 64 -> 46 > 100 -> 66 > 150 -> 94 >> 200 -> 100 > > Lower configuration of kernel/sysctl is displayed. > > I don't know what else can I tune? > > It seems to me that bottleneck is somewhere around pf/kernel buffers > of packet headers. I read somewhere that in bridging packet payload > does not travel through all stack - just header is evaluated. > In case of 64 byte packets in the same time unit there are more > packets for the same bandwith on interfaces and as plain layer2 > bridge performs 100Mbit/s with no problem > the problem is above layer2 :) > > btw: kern.polling.enable=1 does not help - at packetlength 64 bytes > performance is 2x worse than with interrupts. > kernel: > --------------------------- > > cpu I686_CPU > ident ALLOT > > # To statically compile in device wiring instead of > /boot/device.hints > #hints "GENERIC.hints" # Default places to look for > devices. 
> options STOP_NMI # Stop CPUS using NMI instead > of IPI > options AUDIT # Security event auditing > > options ALTQ > options ALTQ_CBQ > options ALTQ_RED > options ALTQ_RIO > options ALTQ_HFSC > options ALTQ_CDNR > options ALTQ_PRIQ > options ALTQ_NOPCC > options HZ=1000 > options DEVICE_POLLING > options IPSTEALTH > options ZERO_COPY_SOCKETS > options MPTABLE_FORCE_HTT # Enable HTT CPUs with the MP Table > options IPI_PREEMPTION > > # To make an SMP kernel, the next two lines are needed > options SMP # Symmetric MultiProcessor > Kernel > device apic # I/O APIC > -------------------------------- > > /etc/sysctl.conf > #kern.polling.enable=1 > kern.ipc.nmbcluster=32768 > kern.ipc.maxsockbufs=2097152 > kern.ipc.somaxconn=8192 > kern.maxfiles=65536 > kern.maxfilesperproc=32768 > net.inet.tcp.delayed_ack=0 > net.inet.tcp.sendspace=65535 > net.inet.udp.recvspace=65535 > net.inet.udp.maxdgram=57344 > net.local.stream.recvspace=65535 > net.local.stream.sendspace=65535 > kern.polling.user_frac=20 > net.isr.direct=0 > net.inet.ip.forwarding=1 > ------------------------------- > > P.S. I tried pfSense, but as we have used Allot before - we need to > see queue statistics in graphs per queue, pfSense just offers > numbers.. > Seems to me that pFsense is good for many things but not for > bridge+traffic shapeing - correct me if I'm wrong. > > Best regards, > Ugis From killing at multiplay.co.uk Sun Mar 16 14:36:31 2008 From: killing at multiplay.co.uk (Steven Hartland) Date: Sun Mar 16 14:36:34 2008 Subject: rrdtool / mtr causing stalling on 7.0 References: <056601c8814c$516c0370$b6db87d4@multiplay.co.uk><20080308221441.E11432@fledge.watson.org> <006401c88181$25cf0e30$b6db87d4@multiplay.co.uk> Message-ID: <005301c88771$5b06b6c0$b6db87d4@multiplay.co.uk> Hi Kris I was wondering if you would be so kind as to take a look at the results below to see if they highlight anything that might be the cause of this performance issue. 
I've raised this on the rrdtool list, and quite a few people seem able to
run 10x the number of updates on non-FreeBSD systems without these
disruptive system-wide stalls.

Given this, and the statement in one of your papers saying you would be
interested in any loads that don't run well on FreeBSD, I hoped you might
be able to have a look at this and suggest areas to focus on.

    Regards
    Steve

----- Original Message -----
From: "Steven Hartland"

> ----- Original Message -----
> From: "Robert Watson"
>> It looks like the attachment got lost on the way through the mailing list.
>>
>> I think the first starting point is: what sort of stall is this? Is it, for
>> example, all network communication stalling, all disk I/O stalling, or the
>> entire kernel and all processes stalling? The usual diagnostics are:
>>
>> - Does the machine stop responding to pings while stalled, and/or possibly
>>   "catch up" all at once when it recovers?
>>
>> - If you run the following loop on the machine without any network or console
>>   I/O, do you see gaps in time stamps:
>>
>>     while (1) {
>>         sleep 1
>>         date >> date.log
>>     }
>>
>> - If you write a short C program that looks a lot like the above loop, but
>>   logs time stamps into an in-memory buffer, and have it look for gaps in the
>>   sequence of >3 seconds, does it run across the stall?
>
> Thanks for the ideas, Robert. The output from the shell script
> shows significant gaps:-
> Sun Mar 9 00:20:33 GMT 2008
> Sun Mar 9 00:20:34 GMT 2008 <== Stall
> Sun Mar 9 00:21:09 GMT 2008
> Sun Mar 9 00:21:10 GMT 2008
> ...
> Sun Mar 9 00:25:23 GMT 2008
> Sun Mar 9 00:25:24 GMT 2008
> Sun Mar 9 00:25:25 GMT 2008
> Sun Mar 9 00:25:27 GMT 2008 <== Stall
> Sun Mar 9 00:25:53 GMT 2008
> Sun Mar 9 00:25:59 GMT 2008
> Sun Mar 9 00:26:00 GMT 2008
>
> Running a ping alongside shows no missed responses.
>
> Enabling lock profiling for the period changes the behaviour somewhat,
> producing shorter but multiple stalls.
>
> Sun Mar 9 00:30:31 GMT 2008
> Sun Mar 9 00:30:32 GMT 2008
> Sun Mar 9 00:30:34 GMT 2008
> Sun Mar 9 00:30:35 GMT 2008
> Sun Mar 9 00:30:36 GMT 2008
> Sun Mar 9 00:30:37 GMT 2008
> Sun Mar 9 00:30:38 GMT 2008
> Sun Mar 9 00:30:41 GMT 2008
> Sun Mar 9 00:30:42 GMT 2008 <== Stall
> Sun Mar 9 00:30:44 GMT 2008
> Sun Mar 9 00:30:45 GMT 2008 <== Stall
> Sun Mar 9 00:30:47 GMT 2008 <== Stall
> Sun Mar 9 00:30:49 GMT 2008
> Sun Mar 9 00:30:50 GMT 2008 <== Stall
> Sun Mar 9 00:30:52 GMT 2008 <== Stall
> Sun Mar 9 00:30:54 GMT 2008
> Sun Mar 9 00:30:55 GMT 2008
> Sun Mar 9 00:30:56 GMT 2008
> Sun Mar 9 00:30:57 GMT 2008 <== Stall
> Sun Mar 9 00:31:03 GMT 2008 <== Stall
> Sun Mar 9 00:31:05 GMT 2008
> Sun Mar 9 00:31:06 GMT 2008 <== Stall
> Sun Mar 9 00:31:08 GMT 2008
> Sun Mar 9 00:31:09 GMT 2008
> Sun Mar 9 00:31:10 GMT 2008
> Sun Mar 9 00:31:11 GMT 2008 <== Stall
> Sun Mar 9 00:31:14 GMT 2008
> Sun Mar 9 00:31:15 GMT 2008
> Sun Mar 9 00:31:16 GMT 2008 <== Stall
> Sun Mar 9 00:31:20 GMT 2008
> Sun Mar 9 00:31:21 GMT 2008
> Sun Mar 9 00:31:22 GMT 2008
>
> Using the following C code we also see stalls:
> #include <stdio.h>
> #include <stdlib.h>
> #include <time.h>
> #include <unistd.h>    /* sleep() */
>
> int main( int argc, char **argv )
> {
>     time_t last = time( NULL );
>     while ( 1 )
>     {
>         time_t now = time( NULL );
>         time_t diff = now - last;
>         /* Each iteration sleeps 1 second, so 2 or more means a stall. */
>         if ( diff >= 2 )
>         {
>             fprintf( stderr, "stalled for %ld seconds\n", (long)diff );
>         }
>         /* ctime() output already ends with a newline. */
>         fprintf( stderr, "%s", ctime( &now ) );
>         last = now;
>         sleep( 1 );
>     }
>
>     exit( 0 );
> }
>
> [date.log]
> Sun Mar 9 00:55:40 GMT 2008
> Sun Mar 9 00:55:43 GMT 2008 <== Stall
> Sun Mar 9 00:56:11 GMT 2008
> Sun Mar 9 00:56:12 GMT 2008
> Sun Mar 9 00:56:13 GMT 2008
> Sun Mar 9 00:56:14 GMT 2008
> Sun Mar 9 00:56:15 GMT 2008
> [/date.log]
>
> [timec output]
> Sun Mar 9 00:55:40 2008
> Sun Mar 9 00:55:41 2008
> Sun Mar 9 00:55:42 2008
> stalled for 2 seconds
> Sun Mar 9 00:55:44 2008
> stalled for 5 seconds
> Sun Mar 9 00:55:49 2008
> stalled for 2 seconds
> Sun Mar 9 00:55:51 2008
> stalled for 2 seconds
> Sun Mar 9
00:55:53 2008 > Sun Mar 9 00:55:54 2008 > Sun Mar 9 00:55:55 2008 > Sun Mar 9 00:55:56 2008 > Sun Mar 9 00:55:57 2008 > Sun Mar 9 00:55:58 2008 > Sun Mar 9 00:55:59 2008 > Sun Mar 9 00:56:00 2008 > Sun Mar 9 00:56:01 2008 > Sun Mar 9 00:56:02 2008 > Sun Mar 9 00:56:03 2008 > Sun Mar 9 00:56:04 2008 > Sun Mar 9 00:56:05 2008 > Sun Mar 9 00:56:06 2008 > Sun Mar 9 00:56:07 2008 > Sun Mar 9 00:56:08 2008 > Sun Mar 9 00:56:09 2008 > Sun Mar 9 00:56:10 2008 > Sun Mar 9 00:56:11 2008 > Sun Mar 9 00:56:12 2008 > Sun Mar 9 00:56:13 2008 > Sun Mar 9 00:56:14 2008 > Sun Mar 9 00:56:15 2008 > [/timec output] > > > As the list ate the attachment, the output from the lock profile > can be found here:- > ftp://ftp1.multiplay.co.uk/pub/other/freebsd-7.0-rrdtool-stall.zip > > Regards > Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. From stefan.lambrev at moneybookers.com Mon Mar 17 08:14:21 2008 From: stefan.lambrev at moneybookers.com (Stefan Lambrev) Date: Mon Mar 17 08:14:24 2008 Subject: FreeBSD 7.0 bridge tuning In-Reply-To: <1205491910.47da58c6ecb6a@www.inbox.lv> References: <1205491910.47da58c6ecb6a@www.inbox.lv> Message-ID: <47DE2840.803@moneybookers.com> Greetings, hugoboy@inbox.lv wrote: > Hello! > > I'm trying to tune FreeBSD 7.0 bridge. > You may want to check this thread - http://lists.freebsd.org/pipermail/freebsd-current/2008-January/082751.html > Environment: > Server - 2 x Xeon 3GHz, 2 x Gb LAN(em driver) + 1 LAN for management, > 1GB RAM. > Can you tell us the exact CPU model? Is it dual core Xeon? 
It's not clear how many cores you have ...

> Testers -2 x Sunrise Telecom 100Mbit Ethernet testers for traffic
> generation.
>
> What I have intended to achieve is to substitute proprietary traffic
> shaper Allot with FreeBSD traffic shaper(Bridge + PF + ALTQ).
> The minimum task is to make FreeBSD shaper to perform perfectly with
> 100Mbit traffic in all spectrum of packet lengths (from 64 bytes to
> at least 1518 bytes)
>
> The situation now:
> with pf turned off - there is no problem, bridge throughput is
> 100Mbit/s no packet loss (starting from 64 byte packets)
>
> With pf on I have statistics:
> packet lengt -> Mbit/s without packet loss
> 64 -> 46
> 100 -> 66
> 150 -> 94
> >200 -> 100
>
How many packets per second do you transmit?
PF has some known limitations; hopefully they will be addressed in
8-current and back-ported someday to 7-STABLE.

> Lower configuration of kernel/sysctl is displayed.
>
> I don't know what else can I tune?
>
> It seems to me that bottleneck is somewhere around pf/kernel buffers
> of packet headers. I read somewhere that in bridging packet payload
> does not travel through all stack - just header is evaluated.
> In case of 64 byte packets in the same time unit there are more
> packets for the same bandwith on interfaces and as plain layer2
> bridge performs 100Mbit/s with no problem
> the problem is above layer2 :)
>
> btw: kern.polling.enable=1 does not help - at packetlength 64 bytes
> performance is 2x worse than with interrupts.
>
I noticed this too - polling is not very helpful with the em driver.
It reduces the load, but more packets are dropped with polling.
In my situation increasing kern.hz to 3000 yielded the best results;
you can try tuning this.

> kernel:
> ---------------------------
>
> cpu I686_CPU
> ident ALLOT
>
> # To statically compile in device wiring instead of
> /boot/device.hints
> #hints "GENERIC.hints" # Default places to look for
> devices.
> options STOP_NMI # Stop CPUS using NMI instead > of IPI > options AUDIT # Security event auditing > > options ALTQ > options ALTQ_CBQ > options ALTQ_RED > options ALTQ_RIO > options ALTQ_HFSC > options ALTQ_CDNR > options ALTQ_PRIQ > options ALTQ_NOPCC > options HZ=1000 > options DEVICE_POLLING > options IPSTEALTH > options ZERO_COPY_SOCKETS > options MPTABLE_FORCE_HTT # Enable HTT CPUs with the MP Table > options IPI_PREEMPTION > > # To make an SMP kernel, the next two lines are needed > options SMP # Symmetric MultiProcessor > Kernel > device apic # I/O APIC > -------------------------------- > > /etc/sysctl.conf > #kern.polling.enable=1 > kern.ipc.nmbcluster=32768 > kern.ipc.maxsockbufs=2097152 > kern.ipc.somaxconn=8192 > kern.maxfiles=65536 > kern.maxfilesperproc=32768 > net.inet.tcp.delayed_ack=0 > net.inet.tcp.sendspace=65535 > net.inet.udp.recvspace=65535 > net.inet.udp.maxdgram=57344 > net.local.stream.recvspace=65535 > net.local.stream.sendspace=65535 > kern.polling.user_frac=20 > net.isr.direct=0 > net.inet.ip.forwarding=1 > ------------------------------- > > P.S. I tried pfSense, but as we have used Allot before - we need to > see queue statistics in graphs per queue, pfSense just offers > numbers.. > Seems to me that pFsense is good for many things but not for > bridge+traffic shapeing - correct me if I'm wrong. > > Best regards, > Ugis > > > _______________________________________________ > freebsd-performance@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-performance > To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org" > From amin.scg at gmail.com Mon Mar 17 09:24:50 2008 From: amin.scg at gmail.com (Aminuddin Abdullah) Date: Mon Mar 17 09:24:54 2008 Subject: V7 High CPU Usage on swi5:+, what is this process? 
In-Reply-To: <20080210120013.4C3D116A421@hub.freebsd.org> References: <20080210120013.4C3D116A421@hub.freebsd.org> Message-ID: <47de32b3.1bbc720a.7cf0.ffff8ff1@mx.google.com> I have just upgraded 5 of my machines to V7 from 6.3 and then realized that all the machines has a high CPU usage. Almost all of them using 80%-90% CPU with more than 8000 connections. Using previous 6.3, it only uses 40-50% CPU with the same kind of connections. Using top -S, I can see that swi5: +, PID 17 process is using 30% of CPU time. What is this process? All the machines are Intel C2D 6300 except one which is a AMD 4000+. Is this normal for V7? How do I downgrade to 6.3 if this V7 killing the CPU? TIA From valerio.daelli at gmail.com Mon Mar 17 10:09:35 2008 From: valerio.daelli at gmail.com (Valerio Daelli) Date: Mon Mar 17 10:09:38 2008 Subject: Bad performance of 7.0 nfs client with Solaris nfs server In-Reply-To: <47BEBBCF.7040907@freebsd.org> References: <27dbfc8c0802190243y113d3059yd0c602850a4dbd6b@mail.gmail.com> <47BB33AD.1050005@FreeBSD.org> <27dbfc8c0802200323r13f69905l4940d0d5accd1eb1@mail.gmail.com> <47BC25C5.1000300@freebsd.org> <27dbfc8c0802200705k482152d4h1bf6e63de24edf59@mail.gmail.com> <47BC5325.8070504@freebsd.org> <27dbfc8c0802210031q3590cafbnbe31698ebdc2d1f2@mail.gmail.com> <47BEBBCF.7040907@freebsd.org> Message-ID: <27dbfc8c0803170309p372e5904vef49b20eff2f4899@mail.gmail.com> > Just now got a chance to look at the trace. It looks like FILE_SYNC is > enabled on the write, which will cause the filer to fully commit the > block (8k in this case) to disk before replying. This will usually hurt > performance. I'm not certain where it is getting set, but you might try > some mount options, like 'async' mode. This might also be a bug in > FreeBSD that is forcing it to be enabled all the time. I'll look > through some source code and see what I can find. > > Eric > > Hi I have yes solved this issue and I have another test. 
Now the mount is sync (no async) and the iozone run includes
the -D flag.
Write performance now goes up from 3MB/s to 30MB/s.

---
root@bsd7:~ iozone -D -+q 1 -i 0 -i 1 -r 2048 -n 2048 -g 2G -Raceb
iozone.xls -f /mnt/nest.ifom-ieo-campus.it/iozone/file.tmp

        Iozone: Performance Test of File I/O
                Version $Revision: 3.283 $
                Compiled for 32 bit mode.
                Build: freebsd

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
                     Erik Habbinga, Kris Strecker, Walter Wong.

        Run began: Mon Mar 17 11:06:28 2008

        Using msync(MS_ASYNC) on mmap files

        Delay 1 seconds between tests enabled.
        Record Size 2048 KB

        Using minimum file size of 2048 kilobytes.
        Using maximum file size of 2097152 kilobytes.

        Excel chart generation enabled
        Auto Mode
        Include close in write timing
        Include fsync in write timing
        Command line used: iozone -D -+q 1 -i 0 -i 1 -r 2048 -n 2048 -g 2G
-Raceb iozone.xls -f /mnt/nest.ifom-ieo-campus.it/iozone/file.tmp

        Output is in Kbytes/sec
        Time Resolution = 0.000004 seconds.

        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.

        File stride size set to 17 * record size.
                                                    random  random    bkwd  record  stride
        KB  reclen   write rewrite    read  reread    read   write    read rewrite    read
      2048    2048   49419   49755  629565  632905
      4096    2048    7713   47431  625536  616224
      8192    2048   28479   49564  630012  620276
     16384    2048   26492   49515  631681  621500
     32768    2048   13030   49572  631771  617552
     65536    2048   24907   37586^C
---

Notice that iozone now reports "Using msync(MS_ASYNC) on mmap files"
(I am not a kernel expert, so I am not sure if it is related to our problem).
Without the -D flag we get 3MB/s with iozone.
Thanks for your help!

Valerio Daelli From valerio.daelli at gmail.com Mon Mar 17 10:10:48 2008 From: valerio.daelli at gmail.com (Valerio Daelli) Date: Mon Mar 17 10:10:51 2008 Subject: Bad performance of 7.0 nfs client with Solaris nfs server In-Reply-To: <27dbfc8c0803170309p372e5904vef49b20eff2f4899@mail.gmail.com> References: <27dbfc8c0802190243y113d3059yd0c602850a4dbd6b@mail.gmail.com> <47BB33AD.1050005@FreeBSD.org> <27dbfc8c0802200323r13f69905l4940d0d5accd1eb1@mail.gmail.com> <47BC25C5.1000300@freebsd.org> <27dbfc8c0802200705k482152d4h1bf6e63de24edf59@mail.gmail.com> <47BC5325.8070504@freebsd.org> <27dbfc8c0802210031q3590cafbnbe31698ebdc2d1f2@mail.gmail.com> <47BEBBCF.7040907@freebsd.org> <27dbfc8c0803170309p372e5904vef49b20eff2f4899@mail.gmail.com> Message-ID: <27dbfc8c0803170310x4a6ea1b6qfd6d752fc98259cf@mail.gmail.com> > > I have yes solved this issue and I have another test. ^^^ I haven't yet solved this issue Sorry. > Now the mount is sync (no async) and the iozone includes > the -D flag. > Now the write performance boosts from 3MB/s to 30MB/s. > > --- > root@bsd7:~ iozone -D -+q 1 -i 0 -i 1 -r 2048 -n 2048 -g 2G -Raceb > iozone.xls -f /mnt/nest.ifom-ieo-campus.it/iozone/file.tmp > > Iozone: Performance Test of File I/O > Version $Revision: 3.283 $ > Compiled for 32 bit mode. > Build: freebsd > > Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins > Al Slater, Scott Rhine, Mike Wisner, Ken Goss > Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR, > Randy Dunlap, Mark Montague, Dan Million, > Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, > Erik Habbinga, Kris Strecker, Walter Wong. > > Run began: Mon Mar 17 11:06:28 2008 > > Using msync(MS_ASYNC) on mmap files > > Delay 1 seconds between tests enabled. > Record Size 2048 KB > > Using minimum file size of 2048 kilobytes. > Using maximum file size of 2097152 kilobytes. 
> > Excel chart generation enabled > Auto Mode > Include close in write timing > Include fsync in write timing > Command line used: iozone -D -+q 1 -i 0 -i 1 -r 2048 -n 2048 -g 2G > -Raceb iozone.xls -f /mnt/nest.ifom-ieo-campus.it/iozone/file.tmp > > Output is in Kbytes/sec > Time Resolution = 0.000004 seconds. > > Processor cache size set to 1024 Kbytes. > Processor cache line size set to 32 bytes. > > File stride size set to 17 * record size. > random > random bkwd record stride > KB reclen write rewrite read reread read > write read rewrite read > 2048 2048 49419 49755 629565 632905 > 4096 2048 7713 47431 625536 616224 > 8192 2048 28479 49564 630012 620276 > 16384 2048 26492 49515 631681 621500 > 32768 2048 13030 49572 631771 617552 > 65536 2048 24907 37586^C > --- > > Notice that now we have using msync(MS_ASYNC) on mmap files > (not a kernel expert so not sure if it is related to our problem). > Without the -D flag we get 3MB/s with iozone. > Thanks for you help! > > Valerio Daelli > From ivoras at freebsd.org Mon Mar 17 18:44:31 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Mon Mar 17 18:44:44 2008 Subject: V7 High CPU Usage on swi5:+, what is this process? In-Reply-To: <47de32b3.1bbc720a.7cf0.ffff8ff1@mx.google.com> References: <20080210120013.4C3D116A421@hub.freebsd.org> <47de32b3.1bbc720a.7cf0.ffff8ff1@mx.google.com> Message-ID: Aminuddin Abdullah wrote: > I have just upgraded 5 of my machines to V7 from 6.3 and then realized that > all the machines has a high CPU usage. Almost all of them using 80%-90% CPU > with more than 8000 connections. Using previous 6.3, it only uses 40-50% CPU > with the same kind of connections. > > Using top -S, I can see that swi5: +, PID 17 process is using 30% of CPU > time. What is this process? "swi" stands for "software interrupt", but to find out which one you'll need to give more information about the systems. What are they running? Any routing, firewall? 
Check with "vmstat -ia" to see if you have an interrupt storm.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 250 bytes
Desc: OpenPGP digital signature
Url : http://lists.freebsd.org/pipermail/freebsd-performance/attachments/20080317/be922dd9/signature.pgp

From rwatson at FreeBSD.org  Tue Mar 18 11:21:24 2008
From: rwatson at FreeBSD.org (Robert Watson)
Date: Tue Mar 18 11:21:31 2008
Subject: V7 High CPU Usage on swi5:+, what is this process?
In-Reply-To: <47de32b3.1bbc720a.7cf0.ffff8ff1@mx.google.com>
References: <20080210120013.4C3D116A421@hub.freebsd.org>
	<47de32b3.1bbc720a.7cf0.ffff8ff1@mx.google.com>
Message-ID: <20080318111805.W17188@fledge.watson.org>

On Mon, 17 Mar 2008, Aminuddin Abdullah wrote:

> I have just upgraded 5 of my machines to V7 from 6.3 and then realized that
> all the machines has a high CPU usage. Almost all of them using 80%-90% CPU
> with more than 8000 connections. Using previous 6.3, it only uses 40-50% CPU
> with the same kind of connections.
>
> Using top -S, I can see that swi5: +, PID 17 process is using 30% of CPU
> time. What is this process?
>
> All the machines are Intel C2D 6300 except one which is a AMD 4000+.
>
> Is this normal for V7? How do I downgrade to 6.3 if this V7 killing the CPU?

'+' is used in a swi name to indicate that the names of the interrupts to put
in the thread name are too long, and the code looks like it was written under
the assumption that at least one name would fit. It sounds like in this case,
none fit. We should fix this code, but in the mean time, what you might
consider doing is hacking intr_event_update() in kern_intr.c to print out
overflowing names to the console using printf(9) so you can at least see what
they are. This is the somewhat suspect bit of code:

    /*
     * If the handler names were too long, add +'s to indicate missing
     * names. If we run out of room and still have +'s to add, change
     * the last character from a + to a *.
     */
    last = &ie->ie_fullname[sizeof(ie->ie_fullname) - 2];
    while (missed-- > 0) {
        if (strlen(ie->ie_fullname) + 1 == sizeof(ie->ie_fullname)) {
            if (*last == '+') {
                *last = '*';
                break;
            } else
                *last = '+';
        } else if (space) {
            strcat(ie->ie_fullname, " +");
            space = 0;
        } else
            strcat(ie->ie_fullname, "+");
    }

I've CC'd John, who might have views on what we should do about this. It would
be nice if we had a way to export information on all the interrupt event
sources, including soft ones, and their mappings to ithreads, including swis,
using sysctl. Or maybe we do already and he'll point us at it. :-)

Robert N M Watson
Computer Laboratory
University of Cambridge

From rwatson at FreeBSD.org  Tue Mar 18 13:04:07 2008
From: rwatson at FreeBSD.org (Robert Watson)
Date: Tue Mar 18 13:04:13 2008
Subject: V7 High CPU Usage on swi5:+, what is this process?
In-Reply-To: <200803180845.28959.jhb@freebsd.org>
References: <20080210120013.4C3D116A421@hub.freebsd.org>
	<47de32b3.1bbc720a.7cf0.ffff8ff1@mx.google.com>
	<20080318111805.W17188@fledge.watson.org>
	<200803180845.28959.jhb@freebsd.org>
Message-ID: <20080318130241.J17188@fledge.watson.org>

On Tue, 18 Mar 2008, John Baldwin wrote:

>> '+' is used in a swi name to indicate that the names of the interrupts to
>> put in the thread name are too long, and the code looks like it was written
>> under the assumption that at least one name would fit. It sounds like in
>> this case, none fit. We should fix this code, but in the mean time, what
>> you might consider doing is hacking intr_event_update() in kern_intr.c to
>> print out overflowing names to the console using printf(9) so you can at
>> least see what they are. This is the somewhat suspect bit of code:
>
> The code is not suspect as p_comm is of fixed length. Someone just used too
> long of a name for a swi handler.

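
To see what the marker loop from kern_intr.c quoted earlier in this thread
produces, it can be reproduced in userland. This is a simplified sketch
under stated assumptions, not the kernel code: mark_missed is a
hypothetical name, the buffer size is chosen by the caller rather than
taken from the kernel, and an extra bounds check is added before
appending " +".

```c
#include <stddef.h>
#include <string.h>

/*
 * Userland sketch of the '+' / '*' marker loop.
 * buf must hold a NUL-terminated string and size must be at least 2.
 * missed is the number of handler names that did not fit; space is
 * non-zero if a separating blank should precede the first '+'.
 */
void mark_missed(char *buf, size_t size, int missed, int space)
{
    char *last = &buf[size - 2];    /* last usable character slot */

    while (missed-- > 0) {
        size_t len = strlen(buf);

        if (len + 1 == size) {
            /* Buffer is full: reuse the final character as a marker. */
            if (*last == '+') {
                *last = '*';
                break;
            }
            *last = '+';
        } else if (space && len + 2 < size) {
            /* Room for a blank and a '+' (added check, see above). */
            strcat(buf, " +");
            space = 0;
        } else {
            strcat(buf, "+");
        }
    }
}
```

With a 16-byte buffer holding "swi5:" and one missed name, the result is
"swi5: +", which is the thread name top(1) reported in the original
question. When the buffer is already full, the last character becomes
'+' and then '*'.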
I was wondering whether we might not do better to put as much in as we can but truncate with a '*', so you at least get a fractional swi name. Under what situations do we use a single ithread for multiple swi's? Robert N M Watson Computer Laboratory University of Cambridge From jhb at freebsd.org Tue Mar 18 13:06:56 2008 From: jhb at freebsd.org (John Baldwin) Date: Tue Mar 18 13:11:56 2008 Subject: V7 High CPU Usage on swi5:+, what is this process? In-Reply-To: <20080318111805.W17188@fledge.watson.org> References: <20080210120013.4C3D116A421@hub.freebsd.org> <47de32b3.1bbc720a.7cf0.ffff8ff1@mx.google.com> <20080318111805.W17188@fledge.watson.org> Message-ID: <200803180845.28959.jhb@freebsd.org> On Tuesday 18 March 2008 07:21:23 am Robert Watson wrote: > On Mon, 17 Mar 2008, Aminuddin Abdullah wrote: > > I have just upgraded 5 of my machines to V7 from 6.3 and then realized > > that all the machines has a high CPU usage. Almost all of them using > > 80%-90% CPU with more than 8000 connections. Using previous 6.3, it only > > uses 40-50% CPU with the same kind of connections. > > > > Using top -S, I can see that swi5: +, PID 17 process is using 30% of CPU > > time. What is this process? > > > > All the machines are Intel C2D 6300 except one which is a AMD 4000+. > > > > Is this normal for V7? How do I downgrade to 6.3 if this V7 killing the > > CPU? > > '+' is used in a swi name to indicate that the names of the interrupts to > put in the thread name are too long, and the code looks like it was written > under the assumption that at least one name would fit. It sounds like in > this case, none fit. We should fix this code, but in the mean time, what > you might consider doing is hacking intr_event_update() in kern_intr.c to > print out overflowing names to the console using printf(9) so you can at > least see what they are. This is the somewhat suspect bit of code: The code is not suspect as p_comm is of fixed length. 
Someone just used too long of a name for a swi handler. > I've CC'd John, who might have views on what we should do about this. It > would be nice if we had a way to export information on all the interrupt > event sources, including soft ones, and their mappings to ithreads, > including swis, using sysctl. Or maybe we do already and he'll point us at > it. :-) We don't and that is what we need for a userland interrupt binding interface to make sense. -- John Baldwin From jhb at freebsd.org Tue Mar 18 13:59:16 2008 From: jhb at freebsd.org (John Baldwin) Date: Tue Mar 18 14:16:08 2008 Subject: V7 High CPU Usage on swi5:+, what is this process? In-Reply-To: <20080318130241.J17188@fledge.watson.org> References: <20080210120013.4C3D116A421@hub.freebsd.org> <200803180845.28959.jhb@freebsd.org> <20080318130241.J17188@fledge.watson.org> Message-ID: <200803180923.13032.jhb@freebsd.org> On Tuesday 18 March 2008 09:04:05 am Robert Watson wrote: > On Tue, 18 Mar 2008, John Baldwin wrote: > >> '+' is used in a swi name to indicate that the names of the interrupts > >> to put in the thread name are too long, and the code looks like it was > >> written under the assumption that at least one name would fit. It > >> sounds like in this case, none fit. We should fix this code, but in the > >> mean time, what you might consider doing is hacking intr_event_update() > >> in kern_intr.c to print out overflowing names to the console using > >> printf(9) so you can at least see what they are. This is the somewhat > >> suspect bit of code: > > > > The code is not suspect as p_comm is of fixed length. Someone just used > > too long of a name for a swi handler. > > I was wondering whether we might not do better to put as much in as we can > but truncate with a '*', so you at least get a fractional swi name. Under > what situations do we use a single ithread for multiple swi's? The softclock one gets overloaded with some tty handlers. 
This code is also just generic ithread code common to swi's and hardware interrupts. -- John Baldwin From ivoras at freebsd.org Thu Mar 20 17:11:25 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Thu Mar 20 17:11:32 2008 Subject: V7 High CPU Usage on swi5:+, what is this process? In-Reply-To: <20080318111805.W17188@fledge.watson.org> References: <20080210120013.4C3D116A421@hub.freebsd.org> <47de32b3.1bbc720a.7cf0.ffff8ff1@mx.google.com> <20080318111805.W17188@fledge.watson.org> Message-ID: Robert Watson wrote: > I've CC'd John, who might have views on what we should do about this. > It would be nice if we had a way to export information on all the > interrupt event sources, including soft ones, and their mappings to > ithreads, including swis, using sysctl. Or maybe we do already and > he'll point us at it. :-) How about, for starts, the truncating loop gets overriden / replaced for ithread process names? From engywook at gmail.com Sun Mar 23 22:48:36 2008 From: engywook at gmail.com (Daniel Andersson) Date: Sun Mar 23 22:58:09 2008 Subject: Tuning: 100mbit faster, gbit slower. Message-ID: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> Hey! I was trying to milk the most out of my 100/100. What I ended up with was something, to me, quite odd. When I hadn't done anything I could ftp things from my server box at 50mb/s and run rtorrent at about 9-10 mb/s at most. After my "tuning" I can only ftp at a very "choppy" 30-40mb/s, but rtorrent runs at about 11mb/s. This is what I did: kern.ipc.maxsockbuf=16777216 net.inet.tcp.sendbuf_max=16777216 net.inet.tcp.recvbuf_max=16777216 according to http://dsd.lbl.gov/TCP-tuning/FreeBSD.html every other setting there was default I believe. I also set these: net.inet.tcp.recvspace: 262144 net.inet.tcp.sendspace: 262144 dmesg: http://pastebin.org/24780 Am I just imagining that rtorrent runs faster? Can't ftp handle high buffers or did I mess something up? Is there something else I could do to make it faster? 
Setting up polling perhaps? Cheers, Daniel Andersson From josh at tcbug.org Mon Mar 24 08:44:40 2008 From: josh at tcbug.org (Josh Paetzel) Date: Mon Mar 24 08:44:46 2008 Subject: Tuning: 100mbit faster, gbit slower. In-Reply-To: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> References: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> Message-ID: <200803240327.01211.josh@tcbug.org> On Sunday 23 March 2008 05:21:48 pm Daniel Andersson wrote: > Hey! > > I was trying to milk the most out of my 100/100. What I > ended up with was something, to me, quite odd. When I > hadn't done anything I could ftp things from my server > box at 50mb/s and run rtorrent at about 9-10 mb/s at most. > After my "tuning" I can only ftp at a very "choppy" > 30-40mb/s, but rtorrent runs at about 11mb/s. > This is what I did: > > kern.ipc.maxsockbuf=16777216 > net.inet.tcp.sendbuf_max=16777216 > net.inet.tcp.recvbuf_max=16777216 > > according to > http://dsd.lbl.gov/TCP-tuning/FreeBSD.html > every other setting there was default > I believe. > > I also set these: > > net.inet.tcp.recvspace: 262144 > net.inet.tcp.sendspace: 262144 > > dmesg: > http://pastebin.org/24780 > > Am I just imagining that rtorrent runs faster? > Can't ftp handle high buffers or did I mess > something up? Is there something else I > could do to make it faster? Setting up > polling perhaps? > > Cheers, > Daniel Andersson The stock settings are more than enough to saturate 100TX with even relatively ancient hardware. And by ancient I mean Pentium 2 class machines. The biggest tuning you can do is use intel (fxp) or 3com (xl) NICS and a halfway decent switch. If your server box can't saturate 100TX ethernet with the defaults then something is amiss. Perhaps provide a dmesg from the server and a client and a tcpdump from an FTP session between them? 
-- Thanks, Josh Paetzel PGP: 8A48 EF36 5E9F 4EDA 5A8C 11B4 26F9 01F1 27AF AECB -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part. Url : http://lists.freebsd.org/pipermail/freebsd-performance/attachments/20080324/45d9f3cf/attachment.pgp From stefan.lambrev at moneybookers.com Mon Mar 24 10:54:39 2008 From: stefan.lambrev at moneybookers.com (Stefan Lambrev) Date: Mon Mar 24 10:54:41 2008 Subject: Tuning: 100mbit faster, gbit slower. In-Reply-To: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> References: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> Message-ID: <47E7885D.6080507@moneybookers.com> Daniel Andersson wrote: > Hey! > > I was trying to milk the most out of my 100/100. What I > ended up with was something, to me, quite odd. When I > hadn't done anything I could ftp things from my server > box at 50mb/s and run rtorrent at about 9-10 mb/s at most. > After my "tuning" I can only ftp at a very "choppy" > 30-40mb/s, but rtorrent runs at about 11mb/s. > Are you sure the problem is in the network ? Sounds like the bottleneck is your HDD. You can run netperf to check this :) > This is what I did: > > kern.ipc.maxsockbuf=16777216 > net.inet.tcp.sendbuf_max=16777216 > net.inet.tcp.recvbuf_max=16777216 > > according to > http://dsd.lbl.gov/TCP-tuning/FreeBSD.html > every other setting there was default > I believe. > > I also set these: > > net.inet.tcp.recvspace: 262144 > net.inet.tcp.sendspace: 262144 > > dmesg: > http://pastebin.org/24780 > > Am I just imagining that rtorrent runs faster? > Can't ftp handle high buffers or did I mess > something up? Is there something else I > could do to make it faster? Setting up > polling perhaps? 
> > Cheers, > Daniel Andersson > _______________________________________________ > freebsd-performance@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-performance > To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org" > From j_guojun at lbl.gov Mon Mar 24 05:40:38 2008 From: j_guojun at lbl.gov (Jin Guojun [VFFS]) Date: Mon Mar 24 11:13:22 2008 Subject: Tuning: 100mbit faster, gbit slower. In-Reply-To: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> References: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> Message-ID: <47E73891.8000803@lbl.gov> You cannot do blind tuning according those numbers. They are for 1-10Gbps pipe. The proper number is "pipe diamter" x "pipe length" = capacity In your case, the maximum number = 100Mbps x "delay from your machine to server" -Jin Daniel Andersson wrote: >Hey! > >I was trying to milk the most out of my 100/100. What I >ended up with was something, to me, quite odd. When I >hadn't done anything I could ftp things from my server >box at 50mb/s and run rtorrent at about 9-10 mb/s at most. >After my "tuning" I can only ftp at a very "choppy" >30-40mb/s, but rtorrent runs at about 11mb/s. >This is what I did: > >kern.ipc.maxsockbuf=16777216 >net.inet.tcp.sendbuf_max=16777216 >net.inet.tcp.recvbuf_max=16777216 > >according to >http://dsd.lbl.gov/TCP-tuning/FreeBSD.html >every other setting there was default >I believe. > >I also set these: > >net.inet.tcp.recvspace: 262144 >net.inet.tcp.sendspace: 262144 > >dmesg: >http://pastebin.org/24780 > >Am I just imagining that rtorrent runs faster? >Can't ftp handle high buffers or did I mess >something up? Is there something else I >could do to make it faster? Setting up >polling perhaps? > >Cheers, >Daniel Andersson > > From engywook at gmail.com Mon Mar 24 11:22:42 2008 From: engywook at gmail.com (Daniel Andersson) Date: Mon Mar 24 11:36:37 2008 Subject: Tuning: 100mbit faster, gbit slower. 
In-Reply-To: <200803240327.01211.josh@tcbug.org>
References: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> <200803240327.01211.josh@tcbug.org>
Message-ID: <24adbbc00803240422m5b04b485s5df2f406aa89dc2b@mail.gmail.com>

Thanks for the reply! Maybe I should have been more clear.
My setup looks like this:
internet - em1(server box) em0 - windoze desktop.

The internet(em1) 100mbit seems to do fine, even better than
before, I get about 11mb/s with rtorrent(uploading). It's the
internal gbit connection that's weird with ftp. It's not as fast nor
as smooth as it was before I did the "tuning". It doesn't have
any trouble running ftp at 30mb/s after the tuning, so it is
definitely capable of delivering 100mbit?

$ifconfig -a
em0: flags=8843 metric 0 mtu 1500
        options=9b
        ether 00:1b:21:0a:1d:87
        inet 192.168.0.10 netmask 0xffffff00 broadcast 192.168.0.255
        media: Ethernet autoselect (1000baseTX )
        status: active
em1: flags=8843 metric 0 mtu 1500
        options=9b
        ether 00:1b:21:0c:d1:b3
        inet external.ip.goes.here netmask 0xfffffc00 broadcast xx.xxx.147.255
        media: Ethernet autoselect (100baseTX )
        status: active
plip0: flags=108810 metric 0 mtu 1500
lo0: flags=8049 metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
pflog0: flags=141 metric 0 mtu 33204

tcpdump as requested:
engy# tcpdump -vv -i em0 portrange 40000-42000 or portrange 20-21 | cat > /usr/home/engy/tcpdump
http://pastebin.org/25081

If you had something else in mind, let me know.
When I transferred large files I sometimes got about 600 packets
dropped by kernel. That can't be good. Also it seemed that I got a lot
of ack packets, don't know if that's normal.

Cheers,
Daniel

>The stock settings are more than enough to saturate 100TX with even relatively
>ancient hardware. And by ancient I mean Pentium 2 class machines.
>
>The biggest tuning you can do is use intel (fxp) or 3com (xl) NICS and a
>halfway decent switch.
> > > >If your server box can't saturate 100TX ethernet with the defaults then > >something is amiss. Perhaps provide a dmesg from the server and a client > and > >a tcpdump from an FTP session between them? > > > >-- > >Thanks, > > > >Josh Paetzel > > >PGP: 8A48 EF36 5E9F 4EDA 5A8C 11B4 26F9 01F1 27AF AECB > > From engywook at gmail.com Mon Mar 24 11:43:52 2008 From: engywook at gmail.com (Daniel Andersson) Date: Mon Mar 24 11:51:34 2008 Subject: Tuning: 100mbit faster, gbit slower. In-Reply-To: <47E7885D.6080507@moneybookers.com> References: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> <47E7885D.6080507@moneybookers.com> Message-ID: <24adbbc00803240443p3fffc741tb80dfda257eb29f@mail.gmail.com> Bloody hell! You were right. I transferred a big file to the system disk and ftped it from there to my desktop and it topped at 59mb/s, Sorry guys! It's quite odd though. It's my newest disk: ad10: 476940MB at ata5-master SATA150 It's connected to: atapci2: port 0xd200-0xd27f,0xd300-0xd3ff mem 0xee0c0000-0xee0c0fff,0xee080000-0xee09ffff irq 18 at device 11.0 on pci2 atapci2: [ITHREAD]atapci2: [ITHREAD] I recall having trouble installing on that controller, it would just reboot. I just checked the 7.0 Hardware notes and can't find it there. That sucks. Guess I'll have to buy a PCI controller then. Any good ones you can recommend? Cheers, Daniel >Are you sure the problem is in the network ? > >Sounds like the bottleneck is your HDD. > >You can run netperf to check this :) > > From engywook at gmail.com Mon Mar 24 12:08:19 2008 From: engywook at gmail.com (Daniel Andersson) Date: Mon Mar 24 12:13:52 2008 Subject: Tuning: 100mbit faster, gbit slower. In-Reply-To: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> References: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> Message-ID: <24adbbc00803240508q224003e1j77f586d4ef6a8bf@mail.gmail.com> I might have spoken too soon. 
I ran the transfer a few times and got this once:

215653 packets captured
379235 packets received by filter
97650 packets dropped by kernel

The transfer wouldn't even start. Buggy windoze client perhaps?
Other times it would jump anywhere from 26-61mb/s. That could be
disk bottleneck I guess.

From archwndas at yahoo.com Mon Mar 24 14:59:43 2008
From: archwndas at yahoo.com (Simeon Nifos)
Date: Mon Mar 24 14:59:48 2008
Subject: run-time performance of regression of sparse matrix vector multiplication
Message-ID: <147812.96521.qm@web56509.mail.re3.yahoo.com>

I have found a problem with FreeBSD AMD64 (maybe i386 too).
Performance decrease related to Linux. I am attaching the results
and the piece of code I used. You have to install g++42 on FreeBSD first.

here are the results of the benchmark:

===============
==== LINUX ====
===============

Intel Core 2
============
number of threads:     1/   2

Sun CC
  create  :          808/ 443
  multiply:         5063/4488
g++-4.2.2
  create  :          881/ 479
  multiply:         5245/4691
intel icpc
  create  :          724/ 404
  multiply:         4903/4594

we see that although the allocation of can be safely parallelized
the multiplication has a really hard time to do so. Are there any
problems with this approach I cannot see?

sysctl dev.cpu.0.freq
[archwn@home /usr/home/archwn/sparsematrixvector]$ sysctl dev.cpu.0.freq
dev.cpu.0.freq: 1654

=====================
==== FreeBSD 7.0 ====
=====================

Intel Core 2
============
number of threads:     1/   2

g++-4.2.2
  create  :         1750/1288
  multiply:         7098/5271

Same optimization flags in both cases with g++-4.2.2.

I have also written a pthreads version of the above code which
doesn't need OpenMP capable compiler at all. This allows us to try
gcc-3.4.6 compiler which is unlikely to have problems of its own.

Is there anything you would like me to try out? Is anybody interested
in having the code in order to perform his own tests?

Thanks in advance,
Archwn.

From josh at tcbug.org Mon Mar 24 15:41:06 2008
From: josh at tcbug.org (Josh Paetzel)
Date: Mon Mar 24 15:41:10 2008
Subject: Tuning: 100mbit faster, gbit slower.
In-Reply-To: <24adbbc00803240422m5b04b485s5df2f406aa89dc2b@mail.gmail.com>
References: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> <200803240327.01211.josh@tcbug.org> <24adbbc00803240422m5b04b485s5df2f406aa89dc2b@mail.gmail.com>
Message-ID: <200803241040.53346.josh@tcbug.org>

On Monday 24 March 2008 06:22:12 am Daniel Andersson wrote:
> Thanks for the reply! Maybe I should have been more clear.
> My setup looks like this:
> internet - em1(server box) em0 - windoze desktop.
>
> The internet(em1) 100mbit seems to do fine, even better than
> before, I get about 11mb/s with rtorrent(uploading). It's the
> internal gbit connection that's weird with ftp. It not as fast nor
> as smooth as it was before I did the "tuning". I doesn't have
> any trouble running ftping at 30mb/s after the tuning so it is
> definately capabel of delivering 100mbit?
>

I think we are having a terminology problem here. Are you meaning 30mb/s as
in 30 megabits per second (30% of 100TX speed) or do you mean 30 MB/s as in
30 Megabytes/sec? I think in rereading you post you are meaning Megabytes
but using the terminology for megabits.

So here's what I've done to nearly saturate gig-e. Keep in mind that I have
15k SAS drives and intel gig-e adapters that aren't sitting in 33mhz 32bit
PCI slots, single IDE/SATA drives are going to be a bottleneck as are 33mhz
32bit PCI NICs.

This is on RELENG_6_3

net.inet.tcp.sendspace=262144
net.inet.tcp.recvspace=262144

kern.ipc.maxsockbuf=1048576
ifconfig em0 mtu 9014 (You'll need a switch that supports jumbo frames to do
this)

iperf shows wire traffic around 969 mbps and FTP runs at 110 Megs/sec
scp/sftp appears to be cpu bound at 45 Megs/sec, and NFS with TCP mounts and
send/receive packets set to 16384 manages about 90 Megs/sec.
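
To put the buffer sizes above next to the "bandwidth x delay" rule mentioned earlier in this thread, here is a small sketch of the arithmetic. The function names are made up for illustration, and the 50 ms round-trip time in the example is an assumed value, not a measurement from anyone's setup.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to keep the path full."""
    return bandwidth_bps * rtt_s / 8


def rtt_covered_s(buffer_bytes, bandwidth_bps):
    """Longest round-trip time at which buffer_bytes can still fill the pipe."""
    return buffer_bytes * 8 / bandwidth_bps


if __name__ == "__main__":
    # How much delay do 262144-byte socket buffers cover at 1 Gbit/s?
    print("RTT covered at 1 Gbit/s: %.2f ms" % (rtt_covered_s(262144, 1e9) * 1e3))
    # Buffer needed for a 100 Mbit/s path with an assumed 50 ms RTT.
    print("BDP at 100 Mbit/s, 50 ms: %d bytes" % bdp_bytes(100e6, 0.05))
```

By this arithmetic a 262144-byte buffer covers a round trip of roughly 2 ms at gigabit speed, which is plenty for a LAN, while a 100 Mbit/s path only needs a large buffer if the round trip is long.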
-- Thanks, Josh Paetzel PGP: 8A48 EF36 5E9F 4EDA 5A8C 11B4 26F9 01F1 27AF AECB -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: This is a digitally signed message part. Url : http://lists.freebsd.org/pipermail/freebsd-performance/attachments/20080324/72e61356/attachment.pgp From engywook at gmail.com Mon Mar 24 15:59:56 2008 From: engywook at gmail.com (Daniel Andersson) Date: Mon Mar 24 16:15:10 2008 Subject: Tuning: 100mbit faster, gbit slower. In-Reply-To: <200803241040.53346.josh@tcbug.org> References: <24adbbc00803231521h78844f26q77c48573f82408b9@mail.gmail.com> <200803240327.01211.josh@tcbug.org> <24adbbc00803240422m5b04b485s5df2f406aa89dc2b@mail.gmail.com> <200803241040.53346.josh@tcbug.org> Message-ID: <24adbbc00803240859w1e1cc372xae8c97a29d474b92@mail.gmail.com> > >I think we are having a terminology problem here. Are you meaning 30mb/s > as > >in 30 megabits per second (30% of 100TX speed) or do you mean 30 MB/s as > in > >30 Megabytes/sec? I think in rereading you post you are meaning > Megabytes > >but using the terminology for megabits. Ah, yes, sorry! I ment 30MB/s. >So here's what I've done to nearly saturate gig-e. Keep in mind that I > have > >15k SAS drives and intel gig-e adapters that aren't sitting in 33mhz > 32bit > >PCI slots, single IDE/SATA drives are going to be a bottleneck as are > 33mhz > >32bit PCI NICs. > > > >This is on RELENG_6_3 > > > > > >net.inet.tcp.sendspace=262144 > >net.inet.tcp.recvspace=262144 > > > >kern.ipc.maxsockbuf=1048576 > >ifconfig em0 mtu 9014 (You'll need a switch that supports jumbo frames to > do > >this) > > > >iperf shows wire traffic around 969 mbps and FTP runs at 110 Megs/sec > >scp/sftp appears to be cpu bound at 45 Megs/sec, and NFS with TCP mounts > and > >send/receive packets set to 16384 manages about 90 Megs/sec. 
> >-- > >Thanks, > >Josh Paetzel > >PGP: 8A48 EF36 5E9F 4EDA 5A8C 11B4 26F9 01F1 27AF AECB > > I don't really need the extra speed on the internal network. I just want the steady 50MB/s back, but also the increased performance of rtorrent I got from my "tuning". I'm also planning on getting more disks and setting up either zfs or softraid to lessen the hdd bottleneck. One thing at a time though. Maybe setting kern.ipc.maxsockbuf=1M+ and leaving everything else default will solve it. Since rtorrent has it's own setting for buffer sizes. Thanks for the replies and sorry for the mixup! Daniel From bmeekhof at umich.edu Tue Mar 25 02:27:23 2008 From: bmeekhof at umich.edu (Benjeman J. Meekhof) Date: Tue Mar 25 02:27:29 2008 Subject: performance tuning on perc6 (LSI) controller Message-ID: <47E85C00.4010601@umich.edu> Hello, I think this might be useful information, and am also hoping for a little input. We've been doing some FreeBSD benchmarking on Dell PE2950 systems with Perc6 controllers (dual-quad Xeon, 16GB, Perc6=LSI card, mfi driver, 7.0-RELEASE). There are two controllers in each system, and each has two MD1000 disk shelves attached via the 2 4x SAS interfaces. (so 30PD available to each controller, 60 PD on the system). My baseline was this - on linux 2.6.20 we're doing 800MB/s write and greater read with this configuration: 2 raid6 volumes volumes striped into a raid0 volume using linux software raid, XFS filesystem. Each raid6 is a volume on one controller using 30 PD. We've spent time tuning this, more than I have with FreeBSD so far. Initially I was getting strangely poor read results. 
Here is one example (before launching into quicker dd tests, I already had
similarly bad results from some more complete iozone tests):

time dd if=/dev/zero of=/test/deletafile bs=1M count=10240
10737418240 bytes transferred in 26.473629 secs (405589209 bytes/sec)
time dd if=/test/deletafile of=/dev/null bs=1M count=10240
10737418240 bytes transferred in 157.700367 secs (68087465 bytes/sec)

To make a very long story short, much better results were achieved in the
end simply by increasing the filesystem blocksize to the maximum (same dd
commands). I'm running a more thorough test on this setup using iozone:

#gstripe label -v -s 128k test /dev/mfid0 /dev/mfid2
#newfs -U -b 65536 /dev/stripe/test
#write: 19.240875 secs (558052492 bytes/sec)
#read: 20.000606 secs (536854644 bytes/sec)

Also did this in /boot/loader.conf - it affected nothing very much in any
test but the settings seemed reasonable so I kept them:

kern.geom.stripe.fast=1
vfs.hirunningspace=5242880
vfs.read_max=32

Any other suggestions to get best throughput? There is also HW RAID
stripe size to adjust larger or smaller. ZFS is also on the list for
testing. Should I perhaps be running -CURRENT or -STABLE to get the best
results with ZFS?

-Ben

--
Benjeman Meekhof - UM ATLAS/AGLT2 Computing
bmeekhof@umich.edu

From bmeekhof at umich.edu Tue Mar 25 02:27:24 2008
From: bmeekhof at umich.edu (Benjeman J. Meekhof)
Date: Tue Mar 25 02:27:29 2008
Subject: performance tuning on perc6 (LSI) controller
Message-ID: <47E85D54.4070905@umich.edu>

Should clarify that the first test mentioned below used the same gstripe
setup as the latter one but did not specify any newfs blocksize:

#gstripe label -v -s 128k test /dev/mfid0 /dev/mfid2
#newfs -U /dev/stripe/test

sorry,
Ben

----------------------------------------

Hello,

I think this might be useful information, and am also hoping for a little
input.
We've been doing some FreeBSD benchmarking on Dell PE2950 systems with Perc6 controllers (dual-quad Xeon, 16GB, Perc6=LSI card, mfi driver, 7.0-RELEASE). There are two controllers in each system, and each has two MD1000 disk shelves attached via the 2 4x SAS interfaces. (so 30PD available to each controller, 60 PD on the system). My baseline was this - on linux 2.6.20 we're doing 800MB/s write and greater read with this configuration: 2 raid6 volumes volumes striped into a raid0 volume using linux software raid, XFS filesystem. Each raid6 is a volume on one controller using 30 PD. We've spent time tuning this, more than I have with FreeBSD so far. Initially I was getting strangely poor read results. Here is one example (before launching into quicker dd tests, i already had similarly bad results from some more complete iozone tests): time dd if=/dev/zero of=/test/deletafile bs=1M count=10240 10737418240 bytes transferred in 26.473629 secs (405589209 bytes/sec) time dd if=/test/deletafile of=/dev/null bs=1M count=10240 10737418240 bytes transferred in 157.700367 secs (68087465 bytes/sec) To make a very long story short, much better results achieved in the end by simply by increasing the filesystem blocksize to the maximum (same dd commands). I'm running a more thorough test on this setup using iozone: #gstripe label -v -s 128k test /dev/mfid0 /dev/mfid2 #newfs -U -b 65536 /dev/stripe/test #write: 19.240875 secs (558052492 bytes/sec) #read: 20.000606 secs (536854644 bytes/sec) Also did this in /boot/loader.conf - it effected nothing very much in any test but the settings seemed reasonable so I kept them: kern.geom.stripe.fast=1 vfs.hirunningspace=5242880 vfs.read_max=32 Any other suggestions to get best throughput? There is also HW RAID stripe size to adjust larger or smaller. ZFS is also on the list for testing. Should I perhaps be running -CURRENT or -STABLE to be get best results with ZFS? 
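
For comparing runs like the ones above, the dd summary lines can be turned into MiB/s with a few lines of script. This is a minimal sketch that assumes the summary has the "N bytes transferred in S secs" shape shown in this message; the function name and the regular expression are illustrative and not part of dd or iozone.

```python
import re

# Matches the summary line printed by dd in this message, e.g.
# "10737418240 bytes transferred in 26.473629 secs (405589209 bytes/sec)"
DD_SUMMARY = re.compile(r"(\d+) bytes transferred in ([\d.]+) secs")


def dd_rate_mib(summary):
    """Return throughput in MiB/s computed from one dd summary line."""
    match = DD_SUMMARY.search(summary)
    if match is None:
        raise ValueError("not a dd summary line: %r" % summary)
    nbytes = int(match.group(1))
    seconds = float(match.group(2))
    return nbytes / seconds / 2**20


if __name__ == "__main__":
    runs = {
        "write": "10737418240 bytes transferred in 26.473629 secs (405589209 bytes/sec)",
        "read": "10737418240 bytes transferred in 157.700367 secs (68087465 bytes/sec)",
    }
    for name, line in runs.items():
        print("%-5s %7.1f MiB/s" % (name, dd_rate_mib(line)))
```

For the first pair of runs this works out to roughly 387 MiB/s written against roughly 65 MiB/s read, which makes the gap to the Linux baseline easier to see than the raw byte counts.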
-Ben -- Benjeman Meekhof - UM ATLAS/AGLT2 Computing bmeekhof@umich.edu From ivoras at freebsd.org Tue Mar 25 09:52:47 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Tue Mar 25 09:52:53 2008 Subject: performance tuning on perc6 (LSI) controller In-Reply-To: <47E85C00.4010601@umich.edu> References: <47E85C00.4010601@umich.edu> Message-ID: Benjeman J. Meekhof wrote: > My baseline was this - on linux 2.6.20 we're doing 800MB/s write and > greater read with this configuration: 2 raid6 volumes volumes striped > into a raid0 volume using linux software raid, XFS filesystem. Each > raid6 is a volume on one controller using 30 PD. We've spent time > tuning this, more than I have with FreeBSD so far. > time dd if=/dev/zero of=/test/deletafile bs=1M count=10240 > 10737418240 bytes transferred in 26.473629 secs (405589209 bytes/sec) > time dd if=/test/deletafile of=/dev/null bs=1M count=10240 > 10737418240 bytes transferred in 157.700367 secs (68087465 bytes/sec) I had similar ratio of results when comparing FreeBSD+UFS to most high-performance Linux file systems (XFS is really great!), so I'd guess it's about as fast as you can get with this combination. > Any other suggestions to get best throughput? There is also HW RAID > stripe size to adjust larger or smaller. ZFS is also on the list for > testing. Should I perhaps be running -CURRENT or -STABLE to be get best > results with ZFS? ZFS will be up to 50% faster on tests such as yours, so you should definitely try it. Unfortunately it's not stable and you probably don't want to use it in production. AFAIK there are no significant differences between ZFS in -current and -stable. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-performance/attachments/20080325/d8985358/signature.pgp From bmeekhof at umich.edu Wed Mar 26 06:00:05 2008 From: bmeekhof at umich.edu (Benjeman J. Meekhof) Date: Wed Mar 26 06:00:10 2008 Subject: performance tuning on perc6 (LSI) controller In-Reply-To: References: <47E85C00.4010601@umich.edu> Message-ID: <47E9E660.6090101@umich.edu> Hi Ivan, Thanks for the response. Your response quotes my initial uneven results, but are you also implying that I most likely cannot achieve results better than the later results which use a larger filesystem blocksize? gstripe label -v -s 128k test /dev/mfid0 /dev/mfid2 #newfs -U -b 65536 /dev/stripe/test #write: 19.240875 secs (558052492 bytes/sec) #read: 20.000606 secs (536854644 bytes/sec) (iozone showed reasonably similar results - depending on recordsize would mostly be writing/reading around 500MB/s, though lows of 300MB/s were recorded in some read situations). I suppose my real question is whether there is some inherent limit in UFS2 or FreeBSD or geom that would prevent going higher than this. Maybe that's really not possible to answer, but certainly I plan to explore a few more configurations. Most of my tuning so far has been trial and error to get to this point, and all I ended up doing to finally get good results was changing filesystem blocksize to the max possible (I wanted to go to 128k but it doesn't let you do that). Apparently UFS2 and/or geom interact differently with the controller than Linux/XFS. This is no great surprise. thanks, Ben Ivan Voras wrote: > Benjeman J. Meekhof wrote: > >> My baseline was this - on linux 2.6.20 we're doing 800MB/s write and >> greater read with this configuration: 2 raid6 volumes volumes striped >> into a raid0 volume using linux software raid, XFS filesystem. Each >> raid6 is a volume on one controller using 30 PD. 
We've spent time >> tuning this, more than I have with FreeBSD so far. > >> time dd if=/dev/zero of=/test/deletafile bs=1M count=10240 >> 10737418240 bytes transferred in 26.473629 secs (405589209 bytes/sec) >> time dd if=/test/deletafile of=/dev/null bs=1M count=10240 >> 10737418240 bytes transferred in 157.700367 secs (68087465 bytes/sec) > > I had similar ratio of results when comparing FreeBSD+UFS to most > high-performance Linux file systems (XFS is really great!), so I'd guess > it's about as fast as you can get with this combination. > >> Any other suggestions to get best throughput? There is also HW RAID >> stripe size to adjust larger or smaller. ZFS is also on the list for >> testing. Should I perhaps be running -CURRENT or -STABLE to be get best >> results with ZFS? > > ZFS will be up to 50% faster on tests such as yours, so you should > definitely try it. Unfortunately it's not stable and you probably don't > want to use it in production. AFAIK there are no significant differences > between ZFS in -current and -stable. > > > -- Benjeman Meekhof - UM ATLAS/AGLT2 Computing office: 734-764-3450 cell: 734-417-6312 From ivoras at freebsd.org Wed Mar 26 09:51:33 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Wed Mar 26 09:51:38 2008 Subject: performance tuning on perc6 (LSI) controller In-Reply-To: <47E9E660.6090101@umich.edu> References: <47E85C00.4010601@umich.edu> <47E9E660.6090101@umich.edu> Message-ID: <9bbcef730803260251g43583da6pff1c291db0cc9246@mail.gmail.com> On 26/03/2008, Benjeman J. Meekhof wrote: > Hi Ivan, > > Thanks for the response. Your response quotes my initial uneven > results, but are you also implying that I most likely cannot achieve > results better than the later results which use a larger filesystem > blocksize? 
> > gstripe label -v -s 128k test /dev/mfid0 /dev/mfid2 > #newfs -U -b 65536 /dev/stripe/test > #write: 19.240875 secs (558052492 bytes/sec) > #read: 20.000606 secs (536854644 bytes/sec) > > (iozone showed reasonably similar results - depending on recordsize > would mostly be writing/reading around 500MB/s, though lows of 300MB/s > were recorded in some read situations). Yes, that was my meaning. If I understood you correctly, Linux manages ~~ 800 MB/s on the array, right? > I suppose my real question is whether there is some inherent limit in > UFS2 or FreeBSD or geom that would prevent going higher than this. > Maybe that's really not possible to answer, but certainly I plan to > explore a few more configurations. I'd guess it's UFS(2), but I don't really know. My own benchmarking was on a different controller (IBM ServeRAID 8) and I got a similar ratio between Linux and FreeBSD, so I don't think it's the drivers' fault. ZFS achieves noticeably better results so it's probably not GEOM's. From david at catwhisker.org Wed Mar 26 20:45:00 2008 From: david at catwhisker.org (David Wolfskill) Date: Wed Mar 26 20:50:08 2008 Subject: Are sysctl(8) values useful for measuring system resource consumption? Message-ID: <20080326203158.GA6302@bunrab.catwhisker.org> At ${work}, one of my projects is to help obtain information regarding the "developer experience," what resources are thus consumed, and figure out ways to mitigate the pain -- the objective, of course, being to help the developers be more productive within a FreeBSD environment. A couple of the perceived "pain points" are the time it takes to perform a CVS checkout and the time it takes to perform a build of the system (ours, at work -- not FreeBSD itself). 
As a step toward obtaining the information, I've cobbled up a Perl
script that essentially acts as a bit of "scaffolding" around time(1);
the script sets things up to invoke time(1) with the "-l" flag (so we
get the rusage structure information) and uses "-o" to direct the output
of time(1) to a file in /tmp, which the script then reads.

The script then spits out a bunch of information as a single record in a
CSV (Comma-Separated Values) file (as that's the format my colleague
wanted): start- and stop-timestamps, the hostname where the processes
ran, the current working directory, real- and effective UIDs & GIDs; the
exit code for the invoked command, the output from time(1), and
(finally) the invoked command itself. (I then use a different script to
read the CSV and update an RRD, then use rrdcgi(1) to generate graphs.)

This has proved to be interesting, and quite possibly useful, but it
merely provides a view as to the resources used by the processes being
invoked from within the "scaffolding." I believe we would be well-served
by also collecting information as to the resources being consumed by the
system as a whole, as well -- for instance, if there's a lot of other
activity on the machine in question, it might be nice to know that (and
it might be even better if we had a way to characterize the rest of the
workload as a whole).

It would be handy if I could arrange to run vmstat(8), iostat(8), or
netstat(1) in such a way that I got counters for the values immediately
prior to starting the command being tested, then got a similar set of
counters just after the test command completed, so I could store the
"counter differences" some place handy. But that doesn't seem too
readily feasible at this time.
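
The "counter differences" idea described above could be sketched roughly
as follows. This is a minimal Python sketch under stated assumptions, not
the Perl scaffolding from the message: the function names are
hypothetical, the counters are assumed to be integer-valued, and
sysctl(8) with its "-n" flag is just one possible source for the
snapshots.

```python
import subprocess


def read_sysctl_counters(names):
    """Return {name: int} for the given sysctl OIDs.

    Assumes every OID in `names` is integer-valued; `sysctl -n` prints
    one bare value per requested name.
    """
    out = subprocess.run(
        ["sysctl", "-n", *names],
        capture_output=True, text=True, check=True,
    ).stdout
    return dict(zip(names, (int(value) for value in out.split())))


def counter_deltas(before, after):
    """Subtract the 'before' snapshot from the 'after' snapshot, key by key."""
    return {name: after[name] - before[name] for name in before if name in after}


def run_with_deltas(cmd, snapshot):
    """Take a snapshot, run cmd, take another snapshot.

    `snapshot` is any zero-argument callable returning {name: int}.
    Returns (exit code of cmd, counter deltas over the run).
    """
    before = snapshot()
    exit_code = subprocess.call(cmd)
    after = snapshot()
    return exit_code, counter_deltas(before, after)
```

On a FreeBSD host, a call might look like
run_with_deltas(["make"], lambda: read_sysctl_counters(["vm.stats.sys.v_swtch"])),
where the OID is only an example and should be checked against the
target system. The deltas cover everything the machine did during the
interval, not just the command under test, which is the system-wide
context the rusage data lacks.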

I had been trying to think of a decent way to get the overall system
information for precisely the interval that I'm running the test, and
then the thought occurred to me that perhaps I could invoke sysctl(8)
with a suitable set of arguments both before & after invoking the
process being tested; perhaps that would be a reasonable way to get
information of the desired quality.

I do not expect to necessarily be able to install random ports on the
development machines, so there's a significant benefit to using an
approach that doesn't require doing that. (I can use scripts that I
write, as they are being invoked by the "test" user, from that test
user's environment.)

So this is a reality check: does that approach make sense? If not, what
shortcomings does it have with respect to other alternatives?

Please recall that the intent is to be able to place the rusage data
from time(1) in a relevant context. I'm reasonably open to suggestions &
alternatives.

Thanks!

Peace,
david
-- 
David H. Wolfskill david@catwhisker.org
I submit that "conspiracy" would be an appropriate collective noun for cats.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-performance/attachments/20080326/5609ebbc/attachment.pgp

From ap00 at mail.ru Fri Mar 28 10:31:56 2008
From: ap00 at mail.ru (Anthony Pankov)
Date: Fri Mar 28 10:32:01 2008
Subject: packet delay because of blackhole
Message-ID: <1333421734.20080328201458@mail.ru>

Just for somebody's convenience.

While analyzing a client<->server HTTPS conversation, a one-second delay
in packet exchange was discovered (consistently reproducible):

Sample:
N   time
6   0.002303  10.28.4.14  10.28.4.50  SSL    Client Hello
7   0.106710  10.28.4.50  10.28.4.14  TCP    443 > 1447 [ACK] Seq=1 Ack=103 Win=65535 Len=0
8   1.045712  10.28.4.50  10.28.4.14  TLSv1  Server Hello, Certificate, Server Hello Done

Another sample:
10  0.011722  10.28.4.14  10.28.4.50  TLSv1  Application Data
11  0.115933  10.28.4.50  10.28.4.14  TCP    443 > 1442 [ACK] Seq=839 Ack=519 Win=65466 Len=0
12  1.054037  10.28.4.50  10.28.4.14  TLSv1  Application Data

Much to my surprise, the cause of the delay is a sysctl tcp.blackhole
value greater than 0. Setting tcp.blackhole to 0 eliminates the delay
(consistently reproducible).

System: FreeBSD 6_2_stable

-- 
Best regards,
 Anthony mailto:ap00@mail.ru
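
The delay in the traces above can be read off by subtracting consecutive
timestamps. A minimal Python sketch of that arithmetic follows; it
assumes the "time" column is seconds since the start of the capture, and
both the function name and the 0.5-second threshold are illustrative
choices rather than anything from the original analysis.

```python
def find_gaps(packets, threshold=0.5):
    """Return (prev_no, no, gap_seconds) for consecutive packets spaced
    further apart than `threshold` seconds.

    `packets` is a list of (packet_number, timestamp_seconds) tuples in
    capture order.
    """
    gaps = []
    for (prev_no, prev_t), (no, t) in zip(packets, packets[1:]):
        gap = t - prev_t
        if gap > threshold:
            gaps.append((prev_no, no, round(gap, 6)))
    return gaps


# Packet numbers and timestamps copied from the two samples in the message.
first_sample = [(6, 0.002303), (7, 0.106710), (8, 1.045712)]
second_sample = [(10, 0.011722), (11, 0.115933), (12, 1.054037)]

print(find_gaps(first_sample))
print(find_gaps(second_sample))
```

In both samples the only gap above the threshold is between the server's
bare ACK and its next data segment, roughly 0.94 s, which together with
the preceding ~0.10 s makes up the one-second stall.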