From tt-list at simplenet.com Tue Aug 12 21:39:32 2008
From: tt-list at simplenet.com (Tim Traver)
Date: Tue Aug 12 21:39:39 2008
Subject: 7.0 CPU and Memory Performance
Message-ID: <48A1F379.2040805@simplenet.com>

Hi All,

I have recently had the opportunity to upgrade a few servers from old
versions of 5.4 to 7.0, and have seen some interesting data. Before
doing this, I wanted to take some benchmarks to see how the scripts that
I would run would fare between the two versions, and the results are
somewhat confusing...

I tried to get as many ducks in a row before posting this, cause i don't
want to waste any of the developers precious time, but I can't guarantee
that my methods were not flawed.

For simplicity, I used a port called ubench (the latest version 0.3,
which I know is quite old) to get the following numbers :

Since I was doing this on the same machine, with completely different
builds (not simply a compile upgrade, but a full install), I figure it
doesn't really matter what kind of machine it is, but just for grins, it
is a Dual Opteron with 2GB of memory in it, compiled with the i386 confs.

The 7.0 is compiled with the ULE scheduler...

The following are averages of at least 5 runs :

FreeBSD 5.4 - CPU 112,721 - MEM - 146,483
FreeBSD 7.0 - CPU 177,339 - MEM - 95,920

Now, I really don't know exactly what the ubench program is doing, but I
think the description says that it is doing random integer and floating
point operations for the CPU tests, and random memory allocation and
copying for the memory test.

So, can we explain the difference???? It looks like the latest SMP code
allows it to process more operations, but what happened to the memory
operations????

Just to get an idea of what this was going to do to my scripts, I tried
some benchmarks for those as well.
I tried to run a PHP script using php 4.4.7 and got the following results :

Using "time php index.php" to get the real time :

FreeBSD 5.4 - 0.290 seconds
FreeBSD 7.0 - 0.335 seconds

So, do the slower memory operations cause that difference in the real
time it takes to run that script???

Thanks,

Tim.

From alfred at freebsd.org Wed Aug 13 06:45:54 2008
From: alfred at freebsd.org (Alfred Perlstein)
Date: Wed Aug 13 06:46:02 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A1F379.2040805@simplenet.com>
References: <48A1F379.2040805@simplenet.com>
Message-ID: <20080813062731.GZ16977@elvis.mu.org>

Hey Tim, please try a later version of FreeBSD 7, there's been
many improvements in the malloc(3) code since 7.0 so these
results aren't very meaningful.

Can you let us know what you see with 7-stable?

thanks,
-Alfred

* Tim Traver [080812 14:39] wrote:
> Hi All,
>
> I have recently had the opportunity to upgrade a few servers from old
> versions of 5.4 to 7.0, and have seen some interesting data. Before
> doing this, I wanted to take some benchmarks to see how the scripts that
> I would run would fare between the two versions, and the results are
> somewhat confusing...
>
> I tried to get as many ducks in a row before posting this, cause i don't
> want to waste any of the developers precious time, but I can't guarantee
> that my methods were not flawed.
>
> For simplicity, I used a port called ubench (the latest version 0.3,
> which I know is quite old) to get the following numbers :
>
> Since I was doing this on the same machine, with completely different
> builds (not simply a compile upgrade, but a full install), I figure it
> doesn't really matter what kind of machine it is, but just for grins, it
> is a Dual Opteron with 2GB of memory in it, compiled with the i386 confs.
>
> The 7.0 is compiled with the ULE scheduler...
>
> The following are averages of at least 5 runs :
>
> FreeBSD 5.4 - CPU 112,721 - MEM - 146,483
> FreeBSD 7.0 - CPU 177,339 - MEM - 95,920
>
> Now, I really don't know exactly what the ubench program is doing, but I
> think the description says that it is doing random integer and floating
> point operations for the CPU tests, and random memory allocation and
> copying for the memory test.
>
> So, can we explain the difference???? It looks like the latest SMP code
> allows it to process more operations, but what happened to the memory
> operations????
>
> Just to get an idea of what this was going to do to my scripts, I tried
> some benchmarks for those as well.
>
> I tried to run a PHP script using php 4.4.7 and got the following results :
>
> Using "time php index.php" to get the real time :
>
> FreeBSD 5.4 - 0.290 seconds
> FreeBSD 7.0 - 0.335 seconds
>
> So, do the slower memory operations cause that difference in the real
> time it takes to run that script???
>
> Thanks,
>
> Tim.
>
> _______________________________________________
> freebsd-performance@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-performance
> To unsubscribe, send any mail to
> "freebsd-performance-unsubscribe@freebsd.org"

--
- Alfred Perlstein

From rwatson at FreeBSD.org Wed Aug 13 08:41:26 2008
From: rwatson at FreeBSD.org (Robert Watson)
Date: Wed Aug 13 08:41:32 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A1F379.2040805@simplenet.com>
References: <48A1F379.2040805@simplenet.com>
Message-ID:

On Tue, 12 Aug 2008, Tim Traver wrote:

> I have recently had the opportunity to upgrade a few servers from old
> versions of 5.4 to 7.0, and have seen some interesting data. Before doing
> this, I wanted to take some benchmarks to see how the scripts that I would
> run would fare between the two versions, and the results are somewhat
> confusing...
There are potentially a lot of variables here, you might want to try fiddling
with the following and see what difference it makes:

(1) Try both 4BSD and ULE in 7.0 -- they have different properties, and at the
    very least it would be nice to see what impact it has.

(2) Statically compile the 5.4 binary, and run the same binary on both 5.4 and
    7.0 -- there have been lots of compiler changes, which might be relevant.

Also, can you confirm that you're running either 32-bit or 64-bit kernels
consistently on both versions of FreeBSD?

Robert N M Watson
Computer Laboratory
University of Cambridge

From tt-list at simplenet.com Wed Aug 13 16:46:46 2008
From: tt-list at simplenet.com (Tim Traver)
Date: Wed Aug 13 16:46:53 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <20080813062731.GZ16977@elvis.mu.org>
References: <48A1F379.2040805@simplenet.com> <20080813062731.GZ16977@elvis.mu.org>
Message-ID: <48A30F8C.9020805@simplenet.com>

Alfred Perlstein wrote:
> Hey Tim, please try a later version of FreeBSD 7, there's been
> many improvements in the malloc(3) code since 7.0 so these
> results aren't very meaningful.
>
> Can you let us know what you see with 7-stable?
>
> thanks,
> -Alfred

Alfred,

Thanks for responding, but I was using the 7.0 stable source that I
checked out about a week and a half ago...Is that not the current???

Tim.

From tt-list at simplenet.com Wed Aug 13 16:52:16 2008
From: tt-list at simplenet.com (Tim Traver)
Date: Wed Aug 13 16:52:58 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To:
References: <48A1F379.2040805@simplenet.com>
Message-ID: <48A310D7.50005@simplenet.com>

Robert Watson wrote:
> On Tue, 12 Aug 2008, Tim Traver wrote:
>
>> I have recently had the opportunity to upgrade a few servers from old
>> versions of 5.4 to 7.0, and have seen some interesting data. Before
>> doing this, I wanted to take some benchmarks to see how the scripts
>> that I would run would fare between the two versions, and the results
>> are somewhat confusing...
>
> There are potentially a lot of variables here, you might want to try
> fiddling with the following and see what difference it makes:
>
> (1) Try both 4BSD and ULE in 7.0 -- they have different properties,
>     and at the very least it would be nice to see what impact it has.
>
> (2) Statically compile the 5.4 binary, and run the same binary on both
>     5.4 and 7.0 -- there have been lots of compiler changes, which
>     might be relevant.
>
> Also, can you confirm that you're running either 32-bit or 64-bit
> kernels consistently on both versions of FreeBSD?
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

Robert,

Yes, I agree, there are a lot of moving variables.

1) I did try the 4BSD scheduler too, and found that it was actually much
worse. It may be because the ubench will spawn a few processes, and ULE
is better at SMP than 4BSD is, but I don't know...

2) Unfortunately, I have now already replaced the 5.4 machines with 7.0.
I tried to take the benchmarks before I rebuilt things. Like I said, I'm
sure my methods were flawed...
These were both compiled with the 32 bit code...

Is there anything that I can do on this latest 7.0 box that might be
useful information???

Thanks,

Tim.

From kris at FreeBSD.org Wed Aug 13 17:21:29 2008
From: kris at FreeBSD.org (Kris Kennaway)
Date: Wed Aug 13 17:21:35 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A310D7.50005@simplenet.com>
References: <48A1F379.2040805@simplenet.com> <48A310D7.50005@simplenet.com>
Message-ID: <48A31812.30908@FreeBSD.org>

Tim Traver wrote:

> Is there anything that I can do on this latest 7.0 box that might be
> useful information???

Someone will need to repeat this under controlled conditions. It's
quite a surprising result.

Kris

From tt-list at simplenet.com Wed Aug 13 17:48:53 2008
From: tt-list at simplenet.com (Tim Traver)
Date: Wed Aug 13 17:49:00 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A31812.30908@FreeBSD.org>
References: <48A1F379.2040805@simplenet.com> <48A310D7.50005@simplenet.com> <48A31812.30908@FreeBSD.org>
Message-ID: <48A31E84.2080502@simplenet.com>

Kris Kennaway wrote:
> Tim Traver wrote:
>
>> Is there anything that I can do on this latest 7.0 box that might be
>> useful information???
>
> Someone will need to repeat this under controlled conditions. It's
> quite a surprising result.
>
> Kris

Kris,

If you can outline a procedure for me, I might be able to do these tests
for you, as I still have the 5.4 disk available.

Couple of things though...I'm not sure what code base the 5.4 box had on
it...I could maybe update the src and recompile the kernel.

I would probably have to have someone else take a look at exactly what
ubench is doing, and if it is indeed a test that makes sense to
use...anyone?

Once I am finished, I would assume you would need things like dmesg
output, the compiling conf parameters, etc...tell me what would be good...
If this is not controlled enough, then you might have to have one of the
performance team do it I guess...

Let me know,

Tim.

From kris at FreeBSD.org Wed Aug 13 17:54:31 2008
From: kris at FreeBSD.org (Kris Kennaway)
Date: Wed Aug 13 17:54:37 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A31E84.2080502@simplenet.com>
References: <48A1F379.2040805@simplenet.com> <48A310D7.50005@simplenet.com> <48A31812.30908@FreeBSD.org> <48A31E84.2080502@simplenet.com>
Message-ID: <48A31FD1.8080306@FreeBSD.org>

Tim Traver wrote:
> Kris Kennaway wrote:
>> Tim Traver wrote:
>>
>>> Is there anything that I can do on this latest 7.0 box that might be
>>> useful information???
>> Someone will need to repeat this under controlled conditions. It's
>> quite a surprising result.
>>
>> Kris
> Kris,
>
> If you can outline a procedure for me, I might be able to do these tests
> for you, as I still have the 5.4 disk available.
>
> Couple of things though...I'm not sure what code base the 5.4 box had on
> it...I could maybe update the src and recompile the kernel.
>
> I would probably have to have someone else take a look at exactly what
> ubench is doing, and if it is indeed a test that makes sense to
> use...anyone?
>
> Once I am finished, I would assume you would need things like dmesg
> output, the compiling conf parameters, etc...tell me what would be good...
>
> If this is not controlled enough, then you might have to have one of the
> performance team do it I guess...

Robert outlined the steps that need to be done to begin with.

Kris

From tt-list at simplenet.com Wed Aug 13 19:03:51 2008
From: tt-list at simplenet.com (Tim Traver)
Date: Wed Aug 13 19:03:59 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To:
References: <48A1F379.2040805@simplenet.com>
Message-ID: <48A33015.2080900@simplenet.com>

Robert Watson wrote:
> On Tue, 12 Aug 2008, Tim Traver wrote:
>
>> I have recently had the opportunity to upgrade a few servers from old
>> versions of 5.4 to 7.0, and have seen some interesting data. Before
>> doing this, I wanted to take some benchmarks to see how the scripts
>> that I would run would fare between the two versions, and the results
>> are somewhat confusing...
>
> There are potentially a lot of variables here, you might want to try
> fiddling with the following and see what difference it makes:
>
> (1) Try both 4BSD and ULE in 7.0 -- they have different properties,
>     and at the very least it would be nice to see what impact it has.
>
> (2) Statically compile the 5.4 binary, and run the same binary on both
>     5.4 and 7.0 -- there have been lots of compiler changes, which
>     might be relevant.
>
> Also, can you confirm that you're running either 32-bit or 64-bit
> kernels consistently on both versions of FreeBSD?
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

Robert,

ok, I looked and it looks like the port compiles statically, and I was
able to grab the binary from the old disk and move it over to the new one...

here is info now on how it is linked :

[root ~]# ldd ubench.5.4
ubench.5.4:
        libm.so.3 => /usr/local/lib/compat/libm.so.3 (0x2807e000)
        libc.so.5 => /usr/local/lib/compat/libc.so.5 (0x28099000)
[root ~]# ldd /usr/local/bin/ubench
/usr/local/bin/ubench:
        libm.so.5 => /lib/libm.so.5 (0x2807f000)
        libc.so.7 => /lib/libc.so.7 (0x28094000)

where ubench is the locally compiled one...
For reference, here are the old stats
FreeBSD 5.4 - CPU 112,721 - MEM - 146,483
FreeBSD 7.0 - CPU 177,339 - MEM - 95,920

And here is the run of the ubench.5.4 binary:
FreeBSD 7.0 - CPU 139,623 - MEM - 207,180

And a rerun of the FreeBSD 7.0 ubench making sure there is absolutely no
activity on the box
FreeBSD 7.0 - CPU 200,562 - MEM - 107,695

That run is a little better than the previous one, but there seems to
still be quite a difference in the memory tests...

Does that show anything ????

Tim.

From kris at FreeBSD.org Wed Aug 13 19:16:17 2008
From: kris at FreeBSD.org (Kris Kennaway)
Date: Wed Aug 13 19:16:23 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A33015.2080900@simplenet.com>
References: <48A1F379.2040805@simplenet.com> <48A33015.2080900@simplenet.com>
Message-ID: <48A332FC.20600@FreeBSD.org>

Tim Traver wrote:
> Robert,
>
> ok, I looked and it looks like the port compiles statically, and I was
> able to grab the binary from the old disk and move it over to the new one...
>
> here is info now on how it is linked :
>
> [root ~]# ldd ubench.5.4
> ubench.5.4:
>         libm.so.3 => /usr/local/lib/compat/libm.so.3 (0x2807e000)
>         libc.so.5 => /usr/local/lib/compat/libc.so.5 (0x28099000)
> [root ~]# ldd /usr/local/bin/ubench
> /usr/local/bin/ubench:
>         libm.so.5 => /lib/libm.so.5 (0x2807f000)
>         libc.so.7 => /lib/libc.so.7 (0x28094000)
>
> where ubench is the locally compiled one...
>
> For reference, here are the old stats
> FreeBSD 5.4 - CPU 112,721 - MEM - 146,483
> FreeBSD 7.0 - CPU 177,339 - MEM - 95,920
>
> And here is the run of the ubench.5.4 binary:
> FreeBSD 7.0 - CPU 139,623 - MEM - 207,180
>
> And a rerun of the FreeBSD 7.0 ubench making sure there is absolutely no
> activity on the box
> FreeBSD 7.0 - CPU 200,562 - MEM - 107,695
>
> That run is a little better than the previous one, but there seems to
> still be quite a difference in the memory tests...
>
> Does that show anything ????

It shows that if there is a difference it is probably in userland, not
the kernel. The obvious guess is the new malloc in 7.0. As for whether
it indicates a bug, someone would have to look more closely at what
ubench does. The author's description of his benchmark doesn't inspire
confidence: it does "rather senseless memory allocation and memory to
memory copying operations for another 3 mins concurrently using several
processes".
Kris

From tt-list at simplenet.com Wed Aug 13 21:00:54 2008
From: tt-list at simplenet.com (Tim Traver)
Date: Wed Aug 13 21:01:01 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A332FC.20600@FreeBSD.org>
References: <48A1F379.2040805@simplenet.com> <48A33015.2080900@simplenet.com> <48A332FC.20600@FreeBSD.org>
Message-ID: <48A34B84.4090100@simplenet.com>

Kris Kennaway wrote:
> It shows that if there is a difference it is probably in userland, not
> the kernel. The obvious guess is the new malloc in 7.0. As for
> whether it indicates a bug, someone would have to look more closely at
> what ubench does. The author's description of his benchmark doesn't
> inspire confidence: it does "rather senseless memory allocation and
> memory to memory copying operations for another 3 mins concurrently
> using several processes".
>
> Kris

Kris,

ok, so is there anything I can do to help????? or, I noticed you cc'ed
some of the other performance guys...they going to check it out?

Tim.

From kris at FreeBSD.org Wed Aug 13 21:10:08 2008
From: kris at FreeBSD.org (Kris Kennaway)
Date: Wed Aug 13 21:10:15 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A34B84.4090100@simplenet.com>
References: <48A1F379.2040805@simplenet.com> <48A33015.2080900@simplenet.com> <48A332FC.20600@FreeBSD.org> <48A34B84.4090100@simplenet.com>
Message-ID: <48A34DAB.6090204@FreeBSD.org>

Tim Traver wrote:
> Kris,
>
> ok, so is there anything I can do to help????? or, I noticed you cc'ed
> some of the other performance guys...they going to check it out?

jasone is the je in jemalloc, so maybe he will be able to comment on
whether whatever the heck ubench does is an abnormally pessimal case for
it, or something.

Kris

From jasone at FreeBSD.org Wed Aug 13 22:35:02 2008
From: jasone at FreeBSD.org (Jason Evans)
Date: Wed Aug 13 22:38:04 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A332FC.20600@FreeBSD.org>
References: <48A1F379.2040805@simplenet.com> <48A33015.2080900@simplenet.com> <48A332FC.20600@FreeBSD.org>
Message-ID: <48A35D7F.3010805@FreeBSD.org>

Kris Kennaway wrote:
> Tim Traver wrote:
>> And here is the run of the ubench.5.4 binary:
>> FreeBSD 7.0 - CPU 139,623 - MEM - 207,180
>>
>> And a rerun of the FreeBSD 7.0 ubench making sure there is absolutely
>> no activity on the box
>> FreeBSD 7.0 - CPU 200,562 - MEM - 107,695
>>
>> That run is a little better than the previous one, but there seems to
>> still be quite a difference in the memory tests...
>>
>> Does that show anything ????
>
> It shows that if there is a difference it is probably in userland, not
> the kernel. The obvious guess is the new malloc in 7.0. As for whether
> it indicates a bug, someone would have to look more closely at what
> ubench does. The author's description of his benchmark doesn't inspire
> confidence: it does "rather senseless memory allocation and memory to
> memory copying operations for another 3 mins concurrently using several
> processes".

The ubench memory benchmark operates almost entirely on 1024B buffers,
which is nearly worst case for jemalloc.
Also, its memory use fluctuates wildly, in a pattern that causes a lot of
dirty page flushing and chunk map/unmap activity. That is where most of
the difference is; jemalloc is more aggressive/effective in returning
pages to the VM than is phkmalloc.

In order to verify the cause of the performance difference, I ran ubench
(on an 8-current system) with MALLOC_OPTIONS=7F6K (avoid flushing dirty
pages, and use 64-MiB chunks in order to avoid repeatedly
mapping/unmapping chunks), and the ubench memory benchmark sped up by
~51%. With the default configuration, jemalloc was ~13% slower than
phkmalloc, but with 7F6K it was ~31% faster than phkmalloc.

One possible factor for stock FreeBSD 7.0 is a scalability issue that I
MFC'ed a fix for in r176922 on 7 March (shortly after the 7.0 release).
And, there's a non-trivial overall performance improvement that I'm
planning to MFC this week.

I encourage you to find some better way of testing memory performance
than ubench. Generic malloc benchmarking is *hard*. The most effective
approach for someone not specifically interested in allocators is to
benchmark the actual applications that will be run in production. If
you find that jemalloc performs poorly in such circumstances, please let
me know the details so that I can look into possible improvements.

Thanks,
Jason

From won.derick at yahoo.com Fri Aug 15 07:26:46 2008
From: won.derick at yahoo.com (Won De Erick)
Date: Fri Aug 15 07:26:57 2008
Subject: Fw: CPU Utilization on IBM x3755
Message-ID: <847976.42317.qm@web45803.mail.sp1.yahoo.com>

forwarding the thread to freebsd-testing@freebsd.org,
freebsd-performance@freebsd.org

Hello,

I was wondering what are the processes running on my machine after
checking the CPU utilization using ps and top commands.
My Platform is IBM x3755 (w/ 8 CPUs) running FreeBSD 6.2.

1. Using top -S

last pid: 90083;  load averages:  0.22,  0.19,  0.17    up 5+00:26:55  15:27:07
186 processes: 19 running, 152 sleeping, 15 waiting
CPU states:  0.9 % user,  0.0 % nice,  7.2 % system,  0.0 % interrupt, 91.9% idle
Mem: 54M Active, 22M Inact, 76M Wired, 26M Buf, 29G Free
Swap: 6144M Total, 6144M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   17 root        1 171   52     0K    16K RUN    0 101.2H 98.97% idle: cpu0
   13 root        1 171   52     0K    16K RUN    4 105.7H 96.83% idle: cpu4
   15 root        1 171   52     0K    16K RUN    2 114.1H 96.04% idle: cpu2
   12 root        1 171   52     0K    16K CPU5   5 109.8H 93.75% idle: cpu5
   16 root        1 171   52     0K    16K RUN    1 109.5H 86.18% idle: cpu1
   14 root        1 171   52     0K    16K RUN    3 113.7H 86.04% idle: cpu3
   10 root        1 171   52     0K    16K RUN    7 111.4H 83.98% idle: cpu7
   11 root        1 171   52     0K    16K RUN    6 110.8H 81.98% idle: cpu6
57213 root       10  20    0 14512K  4548K RUN    7 388:19  0.00% smd
   18 root        1 -32 -151     0K    16K CPU0   0  83:07  0.00% swi4: clock sio
   45 root        1 171   52     0K    16K pgzero 3   3:05  0.00% pagezero
.
.
.
57094 root        1   8    0 31356K  9180K wait   6   0:01  0.00% php-cgi
18533 admin       1  96    0 30732K  4308K select 6   0:01  0.00% sshd

[a] Average CPU Utilization = 0.9 % user + 7.2 % system = 8.1%
                            = 100 - 91.9 = 8.1%

[b] Average CPU Utilization
    = 100 - (98.97 + 96.83 + 96.04 + 93.75 + 86.18 + 86.04 + 83.98 + 81.98) / 8
    = 9.53 %

2. Using ps -aux

USER    PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED        TIME COMMAND
root     17 99.0  0.0     0    16  ??  RL   Fri03PM  6074:32.36 [idle: cpu0]
root     13 97.0  0.0     0    16  ??  RL   Fri03PM  6339:48.43 [idle: cpu4]
root     15 96.2  0.0     0    16  ??  RL   Fri03PM  6847:42.20 [idle: cpu2]
root     12 94.4  0.0     0    16  ??  RL   Fri03PM  6586:50.10 [idle: cpu5]
root     16 85.9  0.0     0    16  ??  RL   Fri03PM  6567:57.05 [idle: cpu1]
root     14 85.4  0.0     0    16  ??  RL   Fri03PM  6823:20.19 [idle: cpu3]
root     10 85.2  0.0     0    16  ??  RL   Fri03PM  6683:28.54 [idle: cpu7]
root     11 81.9  0.0     0    16  ??  RL   Fri03PM  6646:11.63 [idle: cpu6]
root      0  0.0  0.0     0     0  ??  WLs  Fri03PM     0:00.00 [swapper]
root      1  0.0  0.0   912   460  ??  ILs  Fri03PM     0:00.58 /sbin/init --
.
.
.
admin 19027  0.0  0.0 12480  3384  p2  Ss+  12:06PM     0:00.13 -clish (clish)
root  19207  0.0  0.0  2604   940  p2  S+   12:07PM     0:07.65 /usr/sbin/clog -f /var/log/dhcpd.log

Average CPU Utilization
    = 100 - (99 + 97 + 96.2 + 94.4 + 85.9 + 85.4 + 85.2 + 81.9) / 8
    = 9.375 %

Question 1: What is the difference between 1 [a] and 1 [b]?

Question 2: What are the processes (system?) that are running that
resulted in 1 [b]? top -S is just giving a name idle: cpu0, idle: cpu1,
etc. under the command column.

Question 3: Is it logical to compare the average CPU utilization results
obtained in number 1 and 2? How do they differ?

Question 4: Are there other tools that I can use to accurately get the
running processes that are eating the actual CPU resources on the machine?

Please shed some light..

Thanks,

Won

From koitsu at FreeBSD.org Fri Aug 15 12:06:49 2008
From: koitsu at FreeBSD.org (Jeremy Chadwick)
Date: Fri Aug 15 12:18:21 2008
Subject: CPU Utilization on IBM x3755
In-Reply-To: <224002.6575.qm@web45816.mail.sp1.yahoo.com>
References: <224002.6575.qm@web45816.mail.sp1.yahoo.com>
Message-ID: <20080815114655.GA83634@eos.sc1.parodius.com>

On Fri, Aug 15, 2008 at 02:03:06AM -0700, Won De Erick wrote:
> > The utilities you're using are correct (ps and top), but I don't know
> > why you're using top -S since it's pretty apparent you don't know how to
> > read the output. :-)
>
> thanks for the lights.
>
> I may not be well versed in interpreting the output, but I am using
> top -S to make other system processes (like pager, swapper) visible.
> I just wondered why the command name should be idle : cpu0, etc. instead
> of giving a little bit more descriptive name (like what you said, kernel
> thread bla bla). With this, it would be more understandable. An ordinary
> user like me could mistakenly interpret it as an "actual" process.

First and foremost, I'm not sure why you cross-posted this on 3 separate
lists (testing, performance, and hardware).  You probably should have
posted this on freebsd-questions, and if no response after a week or so,
again on freebsd-stable (although you're using FreeBSD 6.2).

It's generally shunned by the FreeBSD mailing list community to
cross-post to so many lists.  Just something to keep in mind for the
future.

> > You shouldn't be using -S if you're just interested in actual processes
> > on the UNIX machine itself.
>
> Then what should I use? I am interested in getting the detailed info to justify the %CPU idles and utilization.

Of the entire machine?  You can see that in top's header:

last pid: 84636;  load averages:  0.47,  0.14,  0.04  up 79+01:15:18  04:46:01
75 processes: 2 running, 70 sleeping, 3 stopped
CPU: 28.6% user,  0.0% nice,  8.2% system,  0.0% interrupt, 63.3% idle
Mem: 207M Active, 2433M Inact, 212M Wired, 91M Cache, 112M Buf, 57M Free
Swap: 8192M Total, 228K Used, 8192M Free

Regarding why you're adding up all the individual process statistics: I
can imagine they would vary a slight bit, but I cannot explain a 7%
variance.  Someone with more knowledge will have to assist there.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From cracauer at cons.org Fri Aug 15 16:46:28 2008
From: cracauer at cons.org (Martin Cracauer)
Date: Fri Aug 15 16:46:35 2008
Subject: 7.0 CPU and Memory Performance
In-Reply-To: <48A1F379.2040805@simplenet.com>
References: <48A1F379.2040805@simplenet.com>
Message-ID: <20080815160649.GA60871@cons.org>

Tim Traver wrote on Tue, Aug 12, 2008 at 01:32:57PM -0700:
>
> For simplicity, I used a port called ubench (the latest version 0.3,
> which I know is quite old) to get the following numbers :

ubench is just another useless artificial benchmark with no basis in
reality.
I forgot the specifics, but the last time I looked into what exactly it
is doing I threw it out instantly.  If you really need to know I might
be able to dig up some old notes from the time, or maybe I bashed it
publicly somewhere.

Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer    http://www.cons.org/cracauer/
FreeBSD - where you want to go, today.      http://www.freebsd.org/

From won.derick at yahoo.com Sat Aug 16 08:18:20 2008
From: won.derick at yahoo.com (Won De Erick)
Date: Sat Aug 16 08:18:27 2008
Subject: CPU Utilization on IBM x3755
Message-ID: <897559.10614.qm@web45814.mail.sp1.yahoo.com>

>First and foremost, I'm not sure why you cross-posted this on 3 separate
>lists (testing, performance, and hardware).  You probably should have
>posted this on freebsd-questions, and if no response after a week or so,
>again on freebsd-stable (although you're using FreeBSD 6.2).

>It's generally shunned by the FreeBSD mailing list community to
>cross-post to so many lists.  Just something to keep in mind for the
>future.

Thanks for this reminder. I am testing my hardware against FreeBSD, and
I just wanted to reach the people in those areas (testing, performance,
hardware) in the hope of getting a speedy response.

>> Then what should I use? I am interested in getting the detailed info to justify the %CPU idles and utilization.

>Of the entire machine?  You can see that in top's header:

>last pid: 84636;  load averages:  0.47,  0.14,  0.04  up 79+01:15:18  04:46:01
>75 processes: 2 running, 70 sleeping, 3 stopped
>CPU: 28.6% user,  0.0% nice,  8.2% system,  0.0% interrupt, 63.3% idle
>Mem: 207M Active, 2433M Inact, 212M Wired, 91M Cache, 112M Buf, 57M Free
>Swap: 8192M Total, 228K Used, 8192M Free

>Regarding why you're adding up all the individual process statistics: I
>can imagine they would vary a slight bit, but I cannot explain a 7%
>variance.  Someone with more knowledge will have to assist there.

Though the header could be conclusive, I would like to see the specific
processes (or threads), and come up with a list of breakdowns, like:

User (28.6%)
  1. Program1 -- 20.0%
  2. Program2 -- 8.6%
System (8.2%)
  1. SystemProcess1 -- X %
  2. (and so on) -- Y %

Then match this by adding up all the individual process statistics. And
if I couldn't match them, at least I could identify some factors that
cause the variance.

This variance has really caught my attention. When I ran "top -SI", the
result was:

last pid:  2746;  load averages:  0.20,  0.16,  0.10  up 0+00:07:38  15:54:26
125 processes: 12 running, 100 sleeping, 16 waiting
CPU states:  0.3% user,  0.0% nice,  6.4% system,  0.0% interrupt, 93.2% idle
Mem: 42M Active, 18M Inact, 62M Wired, 20M Buf, 27G Free
Swap:

  PID USERNAME  THR  PRI  NICE    SIZE     RES STATE   C    TIME    WCPU COMMAND
   17 root        1  171    52      0K     16K RUN     0    6:05  99.02% idle: cpu0
   15 root        1  171    52      0K     16K CPU2    2    6:05  99.02% idle: cpu2
   14 root        1  171    52      0K     16K CPU3    3    6:28  96.58% idle: cpu3
   10 root        1  171    52      0K     16K CPU7    7    6:19  92.24% idle: cpu7
   11 root        1  171    52      0K     16K CPU6    6    6:22  92.09% idle: cpu6
   16 root        1  171    52      0K     16K CPU1    1    6:30  91.26% idle: cpu1
   12 root        1  171    52      0K     16K CPU5    5    6:26  84.47% idle: cpu5
   13 root        1  171    52      0K     16K CPU4    4    6:36  83.50% idle: cpu4

The header says 12 running, but there were only 8 processes displayed.
Sometimes it goes down to 10, but not 8.

I hope you can shed more light on this.

From chris at shagged.org Sat Aug 16 17:33:41 2008
From: chris at shagged.org (Chris Elsworth)
Date: Sat Aug 16 17:33:48 2008
Subject: CPU Utilization on IBM x3755
In-Reply-To: <897559.10614.qm@web45814.mail.sp1.yahoo.com>
References: <897559.10614.qm@web45814.mail.sp1.yahoo.com>
Message-ID: <20080816173339.GA34116@thestra.telon.net>

On Sat, Aug 16, 2008 at 01:18:18AM -0700, Won De Erick wrote:
> >Regarding why you're adding up all the individual process statistics: I
> >can imagine they would vary a slight bit, but I cannot explain a 7%
> >variance.  Someone with more knowledge will have to assist there.
> >
> Though the header could be conclusive, I should want the specific processes (or threads).
> And come up with a list of breakdowns, like:
>
> Then match this by adding up all individual processes statistics. And if I couldn't match, at least I could tell some factors that cause variance.

At the bottom of man top, reformatted for email width:

    As with ps(1), things can change while top is collecting
    information for an update.  The picture it gives is only
    a close approximation to reality.

Could this explain the variations you are seeing? top never gives you
an accurate snapshot of what the system is doing at any one instant in
time, afaik.

-- 
Chris

From hakmi at rogers.com Sat Aug 16 22:23:20 2008
From: hakmi at rogers.com (Tamouh Hakmi)
Date: Sat Aug 16 22:23:27 2008
Subject: getaddrinfo() failed in Apache 2.2 + FreeBSD 6.1
Message-ID: <050201c8ffea$f8668e80$6900a8c0@tamouh>

Hi,

I'm working on a problem with Apache 2.2 + PHP on FreeBSD 6.1 x86.

Recently, I upgraded from Apache 1.3 to v2.2, and since then PHP has
been unable to resolve hostnames unless they're specified in /etc/hosts.
The error we get is:

php_network_getaddresses: getaddrinfo failed: hostname nor servname
provided, or not known in /home/username/public_html/testphp.php on line 2

This is for a simple function. I know it is not a DNS problem, because
if PHP is set up as a CGI, it works fine. It is only in DSO mode that
PHP malfunctions like this. I'm also having no issues with our resolver
for mail, ping or any other services.

Some people have pointed out that this is caused by FreeBSD reaching a
maximum number of file descriptors. Others said this can't be resolved
without upgrading to FBSD 6.3, which I'm not planning to do.

There are several hundred domains hosted on the server, but all was
working fine with Apache 1.3.
Our VNODES are a bit high, but I haven't seen any errors:

server# sysctl kern.maxvnodes
kern.maxvnodes: 100000
server# sysctl vfs.numvnodes
vfs.numvnodes: 84805
server# sysctl kern.maxfiles
kern.maxfiles: 65536
kern.maxfilesperproc: 32767

Later on I discovered that if I comment out about 500 CustomLog entries
in Apache, the errors disappear. This looks like some limit on open
files being hit.

Would anyone be able to guide me in the right direction here?

Thanks,

Tamouh

From won.derick at yahoo.com Sun Aug 17 08:36:18 2008
From: won.derick at yahoo.com (Won De Erick)
Date: Sun Aug 17 08:36:35 2008
Subject: CPU Utilization on IBM x3755
Message-ID: <117484.97387.qm@web45813.mail.sp1.yahoo.com>

On Sat, Aug 16, 2008 at 01:18:18AM -0700, Won De Erick wrote:
>> >Regarding why you're adding up all the individual process statistics: I
>> >can imagine they would vary a slight bit, but I cannot explain a 7%
>> >variance.  Someone with more knowledge will have to assist there.
>> >
>> Though the header could be conclusive, I should want the specific processes (or threads).
>> And come up with a list of breakdowns, like:
>>
>> Then match this by adding up all individual processes statistics. And if I couldn't match, at least I could tell some factors that cause variance.
>
>At the bottom of man top, reformatted for email width:
>
>    As with ps(1), things can change while top is collecting
>    information for an update.  The picture it gives is only
>    a close approximation to reality.
>
>Could this explain the variations you are seeing? top never gives you
>an accurate snapshot of what the system is doing at any one instant in
>time, afaik.
>--
>Chris

I'm aware of this limitation as described in man top. But when I ran
top -SI, I only got 8 processes out of the 12 processes [actively]
running according to the header, which makes it far from reality. I did
this several times, restarting the machine, bootstrapping, etc., but I
got the same result.
I don't know what kernel threads are causing top -S to display the
command as "idle: cpu0", "idle: cpu1", ... "idle: cpu7". Though Jeremy
had mentioned them as simply kernel threads (for lack of a better term),
I should at least dig deeper into this matter and be able to classify
them specifically.
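
The averaging used in the first message of this thread (100 minus the mean of the eight per-CPU idle thread percentages) can be reproduced with a short sketch. The numbers are copied from the top -S and ps -aux listings quoted there; the helper name utilization_from_idle is hypothetical and is not part of top(1) or ps(1).

```python
# Sketch: reproduce the thread's "100 - mean(idle %)" arithmetic.
# utilization_from_idle is a hypothetical helper name used only for illustration.

def utilization_from_idle(idle_percentages):
    """Return 100 minus the mean of the per-CPU idle percentages."""
    return 100.0 - sum(idle_percentages) / len(idle_percentages)

# WCPU column of the eight "idle: cpuN" threads in the top -S listing.
top_idle = [98.97, 96.83, 96.04, 93.75, 86.18, 86.04, 83.98, 81.98]
# %CPU column of the same eight threads in the ps -aux listing.
ps_idle = [99.0, 97.0, 96.2, 94.4, 85.9, 85.4, 85.2, 81.9]

top_util = utilization_from_idle(top_idle)
ps_util = utilization_from_idle(ps_idle)
header_util = 0.9 + 7.2  # user + system percentages from the top header

print(f"from top -S idle threads : {top_util:.2f}%")
print(f"from ps -aux idle threads: {ps_util:.3f}%")
print(f"from top header          : {header_util:.1f}%")
```

The two per-thread figures come out about 1.3 to 1.4 points above the header's 8.1%. That gap is at least consistent with the caveat in the top manual quoted above, since the header and the per-thread columns need not describe exactly the same instant.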