From julian at elischer.org Mon Feb 2 12:29:49 2009
From: julian at elischer.org (Julian Elischer)
Date: Mon Feb 2 12:30:01 2009
Subject: Vimage globals vs structures measurements.
In-Reply-To: <4984241B.5010103@elischer.org>
References: <498414E5.7020904@elischer.org> <4984241B.5010103@elischer.org>
Message-ID: <4987548A.7000609@elischer.org>

Julian Elischer wrote:
> Julian Elischer wrote:
>>
>> anyone who has commands and args for their favourite
>> thing they'd like me to test... send it in..
>>
>>
>> so far using ttcp I have seen no measurable difference.
>>
>> but I have more tests to do of course..
>>
>> for example throughput with small packets with ttcp (KB/Sec)....
>>
>>
>> x VIMAGE_GLOBALS
>> + NO_VIMAGE_GLOBALS
>> [ministat distribution plot]
>>     N           Min           Max        Median           Avg        Stddev
>> x  40      48016.01      57361.32      56268.06     54915.582     2554.0133
>> +  40      48999.66      59646.59      56261.58     56086.798     3119.1782
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
> as I said before most of my tests have shown no real change but this one
> has the most change I've seen.. it's 160 byte udp packets sent between
> two identical machines (both using the same kernel each time).
>
>
> x VIMAGE_GLOBALS
> + NO_VIMAGE_GLOBALS
> [ministat distribution plot]
>     N           Min           Max        Median           Avg        Stddev
> x 150      10175.11      11292.11      10763.80      10760.77     200.92124
> + 150      10075.64      11019.12      10591.68     10580.059     172.29227
> Difference at 95.0% confidence
>         -180.711 +/- 42.3572
>         -1.67935% +/- 0.393626%
>         (Student's t, pooled s = 187.155)
>
> this one showed a 1.7% slowdown
> where the one above showed a half percent speedup
> (but not considered significant).
>
> The first one shown above was TCP with 1500 byte packets on bge 1G
> interfaces..
>
> more test ideas appreciated...

more tests.. this one with iperf...

x NO_VIMAGE_GLOBALS
+ VIMAGE_GLOBALS
[ministat distribution plot]
    N           Min           Max        Median           Avg        Stddev
x 120           418           441           435       435.025     3.4089908
+ 120           423           438           429     429.51667     3.4664862
Difference at 95.0% confidence
        -5.50833 +/- 0.869898
        -1.26621% +/- 0.199965%
        (Student's t, pooled s = 3.43786)

bigger is better...
In this case we see that NO_VIMAGE_GLOBALS is better.

Over several iterations I have come to the conclusion that
other factors are overwhelming this change and that the effect of
clustering all the 'global' variables together into a single global
structure is negligible.

If I can get some confirmation of this by others then
the next step would be to simply remove the VIMAGE_GLOBALS option
and all the global variables it covers.

At least that's what seems next to me..

see:
http://wiki.freebsd.org/Image/Notes200808DevSummit

From bzeeb-lists at lists.zabbadoz.net Wed Feb 4 11:00:14 2009
From: bzeeb-lists at lists.zabbadoz.net (Bjoern A. Zeeb)
Date: Wed Feb 4 11:00:21 2009
Subject: Vimage globals vs structures measurements.
In-Reply-To: <4987548A.7000609@elischer.org>
References: <498414E5.7020904@elischer.org> <4984241B.5010103@elischer.org> <4987548A.7000609@elischer.org>
Message-ID: <20090204184526.I93725@maildrop.int.zabbadoz.net>

On Mon, 2 Feb 2009, Julian Elischer wrote:

Hi,

> If I can get some confirmation of this by others then
> the next step would be to simply remove the VIMAGE_GLOBALS option
> and all the global variables it covers.
>
> At least that's what seems next to me..

no, the next step is to bring in the beaf (last step).

I think we had clearly decided (somewhen, somewho) that we want one
version with all three options at the same time.
Once we are confident, hopefully after a few days at that point,
VIMAGE_GLOBALS will go away.

So please do not rip that out. In two months there were no real
accidents wrt. VIMAGE_GLOBALS even with all the larger changes that
went in. I think it's safe to keep them another 4-6 weeks.

/bz

--
Bjoern A. Zeeb                      The greatest risk is not taking one.

From bz at FreeBSD.org Wed Feb 4 11:15:09 2009
From: bz at FreeBSD.org (Bjoern A. Zeeb)
Date: Wed Feb 4 11:15:20 2009
Subject: Vimage globals vs structures measurements.
In-Reply-To: <20090204184526.I93725@maildrop.int.zabbadoz.net>
References: <498414E5.7020904@elischer.org> <4984241B.5010103@elischer.org> <4987548A.7000609@elischer.org> <20090204184526.I93725@maildrop.int.zabbadoz.net>
Message-ID: <20090204185656.B93725@maildrop.int.zabbadoz.net>

On Wed, 4 Feb 2009, Bjoern A. Zeeb wrote:

> On Mon, 2 Feb 2009, Julian Elischer wrote:
>
> Hi,
>
>> If I can get some confirmation of this by others then
>> the next step would be to simply remove the VIMAGE_GLOBALS option
>> and all the global variables it covers.
>>
>> At least that's what seems next to me..
>
> no, the next step is to bring in the beaf (last step).

... beef ... anyway. The indirection, the real virtualization,
the multiple images, ... you count my typos;)

> I think we had clearly decided (somewhen, somewho) that we want one
> version with all three options at the same time.
> Once we are confident, hopefully after a few days at that point,
> VIMAGE_GLOBALS will go away.
>
> So please do not rip that out. In two months there were no real
> accidents wrt. VIMAGE_GLOBALS even with all the larger changes that
> went in. I think it's safe to keep them another 4-6 weeks.
>
> /bz
>
>

--
Bjoern A. Zeeb                      The greatest risk is not taking one.

From ragnar at gatorhole.com Sat Feb 7 13:30:24 2009
From: ragnar at gatorhole.com (Ragnar Lonn)
Date: Sat Feb 7 13:30:30 2009
Subject: More open sockets with vimages?
Message-ID: <498DF945.3000702@gatorhole.com>

Hi all,

I am a longtime (well, since 2004 or so) vimage user. It's really nice
to see this great stuff getting into the main branch!

I have a quick question for the list: I want to be able to have a
machine handle a *lot* of open network connections.
Many systems have various issues with this, but I thought that
FreeBSD's vimages might be able to get around the problem. Is it
possible to have more open network sockets using multiple vimages than
it is on a FreeBSD without vimage? Are sockets still a global resource,
or do they get multiplied with each vimage?

I apologize if this is a stupid question, I haven't been using
FreeBSD+vimages in a while (although I hope to change that soon!)

Cheers,

/Ragnar

From julian at elischer.org Sat Feb 7 14:42:10 2009
From: julian at elischer.org (Julian Elischer)
Date: Sat Feb 7 14:42:16 2009
Subject: More open sockets with vimages?
In-Reply-To: <498DF945.3000702@gatorhole.com>
References: <498DF945.3000702@gatorhole.com>
Message-ID: <498E0797.4040002@elischer.org>

Ragnar Lonn wrote:
> Hi all,
>
> I am a longtime (well, since 2004 or so) vimage user. It's really nice
> to see this great stuff getting into the main branch!
>
> I have a quick question for the list: I want to be able to have a
> machine handle a *lot* of open network connections. Many systems have
> various issues with this, but I thought that FreeBSD's vimages might
> be able to get around the problem. Is it possible to have more open
> network sockets using multiple vimages than it is on a FreeBSD without
> vimage? Are sockets still a global resource, or do they get multiplied
> with each vimage?
>
> I apologize if this is a stupid question, I haven't been using
> FreeBSD+vimages in a while (although I hope to change that soon!)

sockets are a global resource that is assigned to vimages.
However the number of sockets available is tunable.
how many are we talking about here?

>
> Cheers,
>
> /Ragnar
>
>
> _______________________________________________
> freebsd-virtualization@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
> To unsubscribe, send any mail to
> "freebsd-virtualization-unsubscribe@freebsd.org"

From ragnar at gatorhole.com Sun Feb 8 03:43:18 2009
From: ragnar at gatorhole.com (Ragnar Lonn)
Date: Sun Feb 8 03:43:25 2009
Subject: More open sockets with vimages?
In-Reply-To: <498E0797.4040002@elischer.org>
References: <498DF945.3000702@gatorhole.com> <498E0797.4040002@elischer.org>
Message-ID: <498EC554.4020905@gatorhole.com>

Julian Elischer wrote:
> sockets are a global resource that is assigned to vimages.
> However the number of sockets available is tunable.
> how many are we talking about here?

100,000+ sockets. It seems to me like there is a need to be able to
handle *many* open network connections as servers get more and more CPU
cores, memory, and higher-speed network interfaces, but most people
claim that it is very hard to get 100k open sockets working nicely on a
single machine, even on a modern OS (though I've found a couple of
people that say they can, also, on Linux systems). Ok if 65k sockets is
the normal limit per process and per IP address, but for the whole OS,
it just seems strange to limit things to 65k (or less).

/Ragnar

From ticso at cicely7.cicely.de Sun Feb 8 05:29:13 2009
From: ticso at cicely7.cicely.de (Bernd Walter)
Date: Sun Feb 8 05:34:48 2009
Subject: More open sockets with vimages?
In-Reply-To: <498EC554.4020905@gatorhole.com>
References: <498DF945.3000702@gatorhole.com> <498E0797.4040002@elischer.org> <498EC554.4020905@gatorhole.com>
Message-ID: <20090208130435.GL32126@cicely7.cicely.de>

On Sun, Feb 08, 2009 at 12:43:16PM +0100, Ragnar Lonn wrote:
> Julian Elischer wrote:
> >sockets are a global resource that is assigned to vimages.
> >However the number of sockets available is tunable.
> >how many are we talking about here?
>
> 100,000+ sockets.
> It seems to me like there is a need to be able to
> handle *many* open network connections as servers get more and more CPU
> cores, memory, and higher-speed network interfaces, but most people
> claim that it is very hard to get 100k open sockets working nicely on a
> single machine, even on a modern OS (though I've found a couple of
> people that say they can, also, on Linux systems). Ok if 65k sockets is
> the normal limit per process and per IP address, but for the whole OS,
> it just seems strange to limit things to 65k (or less).

This is simple maths:
100k Sockets with 32k TX and 64k RX buffer take 9G Memory.
Just buffer space, not to mention socket state, ...
On i386 this is limited by kmem, which defaults to IIRC 512MB and
is limited by 32bit virtual address space on i386.
On amd64 depending on the OS version you can have a kmem of slightly
less than 2G max or several GB.
Nevertheless you are still limited with physical RAM.
Smaller buffers are possible, but usually people want larger buffers
to keep up with recent line speeds.
Today buffer sizes can be dynamic - don't know the exact details, but
you should keep in mind that 32k/96k is already quite small for
many purposes.

--
B.Walter                http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers and more.

From ragnar at gatorhole.com Sun Feb 8 05:46:27 2009
From: ragnar at gatorhole.com (Ragnar Lonn)
Date: Sun Feb 8 05:46:33 2009
Subject: More open sockets with vimages?
In-Reply-To: <20090208130435.GL32126@cicely7.cicely.de>
References: <498DF945.3000702@gatorhole.com> <498E0797.4040002@elischer.org> <498EC554.4020905@gatorhole.com> <20090208130435.GL32126@cicely7.cicely.de>
Message-ID: <498EE22E.7020005@gatorhole.com>

Bernd Walter wrote:
> This is simple maths:
> 100k Sockets with 32k TX and 64k RX buffer take 9G Memory.
> Just buffer space, not to mention socket state, ...
> On i386 this is limited by kmem, which defaults to IIRC 512MB and
> is limited by 32bit virtual address space on i386.
> On amd64 depending on the OS version you can have a kmem of slightly
> less than 2G max or several GB.
> Nevertheless you are still limited with physical RAM.
> Smaller buffers are possible, but usually people want larger buffers
> to keep up with recent line speeds.
> Today buffer sizes can be dynamic - don't know the exact details, but
> you should keep in mind that 32k/96k is already quite small for
> many purposes.
>

But physical memory is cheap, and most low-end machines can have 16G or
more today. Is it just a matter of having enough RAM and a 64-bit OS
then? How much is "several GB [kmem]" that you mention above?

/Ragnar

From ticso at cicely7.cicely.de Sun Feb 8 06:42:03 2009
From: ticso at cicely7.cicely.de (Bernd Walter)
Date: Sun Feb 8 06:42:10 2009
Subject: More open sockets with vimages?
In-Reply-To: <498EE22E.7020005@gatorhole.com>
References: <498DF945.3000702@gatorhole.com> <498E0797.4040002@elischer.org> <498EC554.4020905@gatorhole.com> <20090208130435.GL32126@cicely7.cicely.de> <498EE22E.7020005@gatorhole.com>
Message-ID: <20090208144155.GN32126@cicely7.cicely.de>

On Sun, Feb 08, 2009 at 02:46:22PM +0100, Ragnar Lonn wrote:
> Bernd Walter wrote:
> >This is simple maths:
> >100k Sockets with 32k TX and 64k RX buffer take 9G Memory.
> >Just buffer space, not to mention socket state, ...
> >On i386 this is limited by kmem, which defaults to IIRC 512MB and
> >is limited by 32bit virtual address space on i386.
> >On amd64 depending on the OS version you can have a kmem of slightly
> >less than 2G max or several GB.
> >Nevertheless you are still limited with physical RAM.
> >Smaller buffers are possible, but usually people want larger buffers
> >to keep up with recent line speeds.
> >Today buffer sizes can be dynamic - don't know the exact details, but
> >you should keep in mind that 32k/96k is already quite small for
> >many purposes.
> >
>
> But physical memory is cheap, and most low-end machines can have 16G or
> more today. Is it just a matter of having enough RAM and a 64-bit OS
> then? How much is "several GB [kmem]" that you mention above?

AFAIK it is the only limitation - people have been using 100k+ sockets
since at least FreeBSD-4, but with several restrictions because of
memory.
It mostly depends on your application and network topology to your
peers.
Don't know where the current kmem limits exactly are - AFAIK kmem is
held within KVA and KVA is limited by a static map size.
It has been widely discussed recently, because ZFS loves a large kmem.

--
B.Walter                http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers and more.

From julian at elischer.org Thu Feb 19 12:56:47 2009
From: julian at elischer.org (Julian Elischer)
Date: Thu Feb 19 12:56:57 2009
Subject: Vimage next step
Message-ID: <499DC0F7.10709@elischer.org>

I've been doing performance testing on the 'non-vimage' 'structified'
case VS the original 'globals' case and have not been able to see any
really significant differences (though I have seen very slight
differences in the distribution of results).

SO I think we are in the position of moving forward to the next steps.

I think that just means checking in the rest of the vimage tree
from what I have seen.

Then we can play with it a bit and then proceed to the
jail/vimage merge stuff that Jamie (and bz) are working on.

One thing I'd like to do is make the following changes:

1/ evaluate the ordering of the items in the vimage structures to see if
there are items that should be clustered for cache reasons.

2/ remove all sub structures from the vimage structures
and replace them with pointers. This is because putting them in
directly in the vimage structures will make our lives harder due to ABI
issues. If they are independently allocated (*) then we don't need to
worry about them changing in size.

(*) actually they could still be allocated as a blob but we would access
them as if they are separate.

comments?

Julian

From julian at elischer.org Thu Feb 19 15:26:04 2009
From: julian at elischer.org (Julian Elischer)
Date: Thu Feb 19 15:26:11 2009
Subject: Vimage next step
In-Reply-To: <499DC0F7.10709@elischer.org>
References: <499DC0F7.10709@elischer.org>
Message-ID: <499DEA9D.70105@elischer.org>

Julian Elischer wrote:
> I've been doing performance testing on the 'non-vimage' 'structified'
> case VS the original 'globals' case and have not been able to see any
> really significant differences (though I have seen very slight
> differences in the distribution of results).
>
> SO I think we are in the position of moving forward to the next steps.
>
> I think that just means checking in the rest of the vimage tree
> from what I have seen.
>
> Then we can play with it a bit and then proceed to the
> jail/vimage merge stuff that Jamie (and bz) are working on.
>
> One thing I'd like to do is make the following changes:
>
> 1/ evaluate the ordering of the items in the vimage structures to see if
> there are items that should be clustered for cache reasons.
>
> 2/ remove all sub structures from the vimage structures
> and replace them with pointers. This is because putting them in
> directly in the vimage structures will make our lives harder due to ABI
> issues. If they are independently allocated (*) then we don't need to
> worry about them changing in size.

for example, the ipsec struct starts with:

        int     _ipsec_debug;
        struct ipsecstat _ipsec4stat;
        struct secpolicy _ip4_def_policy;

        int     _ip4_esp_trans_deflev;
        int     _ip4_esp_net_deflev;

This effectively fixes the size of the ipsecstat and secpolicy
structures. I would like instead to have:

        int     _ipsec_debug;
        struct ipsecstat *_ipsec4stat;
        struct secpolicy *_ip4_def_policy;

        int     _ip4_esp_trans_deflev;
        int     _ip4_esp_net_deflev;

and have the initializer function allocate those separately.
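
A minimal userland sketch of the pointer-based layout and a separately
allocating initializer, as described just above. Everything beyond the five
field names is an assumption made for illustration: the contents of
struct ipsecstat and struct secpolicy are invented stand-ins, and
vnet_ipsec_sketch, vnet_ipsec_sketch_init() and vnet_ipsec_sketch_destroy()
are hypothetical names. Kernel code would presumably use the kernel allocator
rather than calloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the real kernel types; the fields are invented. */
struct ipsecstat {
	unsigned long in_success;
	unsigned long out_success;
};

struct secpolicy {
	int refcnt;
	int policy;
};

/* Pointer-based layout, following the second listing above. */
struct vnet_ipsec_sketch {
	int _ipsec_debug;
	struct ipsecstat *_ipsec4stat;
	struct secpolicy *_ip4_def_policy;

	int _ip4_esp_trans_deflev;
	int _ip4_esp_net_deflev;
};

/* Release both sub-structures; safe on a partly built container. */
void
vnet_ipsec_sketch_destroy(struct vnet_ipsec_sketch *v)
{
	assert(v != NULL);
	free(v->_ipsec4stat);
	free(v->_ip4_def_policy);
	v->_ipsec4stat = NULL;
	v->_ip4_def_policy = NULL;
}

/*
 * Hypothetical initializer: allocates each sub-structure separately and
 * zero-fills it. Returns 0 on success, -1 if an allocation failed.
 */
int
vnet_ipsec_sketch_init(struct vnet_ipsec_sketch *v)
{
	assert(v != NULL);
	v->_ipsec_debug = 0;
	v->_ip4_esp_trans_deflev = 0;
	v->_ip4_esp_net_deflev = 0;
	v->_ipsec4stat = calloc(1, sizeof(*v->_ipsec4stat));
	v->_ip4_def_policy = calloc(1, sizeof(*v->_ip4_def_policy));
	if (v->_ipsec4stat == NULL || v->_ip4_def_policy == NULL) {
		vnet_ipsec_sketch_destroy(v);
		return (-1);
	}
	return (0);
}
```

With this shape, code that used to read v->_ipsec4stat.in_success would read
v->_ipsec4stat->in_success, which is one extra pointer dereference per access.
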
>
>
> (*) actually they could still be allocated as a blob but we would access
> them as if they are separate.
>
> comments?
>
> Julian

From zec at freebsd.org Fri Feb 20 02:07:37 2009
From: zec at freebsd.org (Marko Zec)
Date: Fri Feb 20 02:07:44 2009
Subject: Vimage next step
In-Reply-To: <499DEA9D.70105@elischer.org>
References: <499DC0F7.10709@elischer.org> <499DEA9D.70105@elischer.org>
Message-ID: <200902201054.35147.zec@freebsd.org>

On Friday 20 February 2009 00:26:21 Julian Elischer wrote:
> Julian Elischer wrote:
> > I've been doing performance testing on the 'non-vimage'
> > 'structified' case VS the original 'globals' case and have not been
> > able to see any really significant differences (though I have seen
> > very slight differences in the distribution of results).
> >
> > SO I think we are in the position of moving forward to the next
> > steps.
> >
> > I think that just means checking in the rest of the vimage tree
> > from what I have seen.
> >
> > Then we can play with it a bit and then proceed to the
> > jail/vimage merge stuff that Jamie (and bz) are working on.
> >
> > One thing I'd like to do is make the following changes:
> >
> > 1/ evaluate the ordering of the items in the vimage structures to
> > see if there are items that should be clustered for cache reasons.
> >
> > 2/ remove all sub structures from the vimage structures
> > and replace them with pointers. This is because putting them in
> > directly in the vimage structures will make our lives harder due to
> > ABI issues. If they are independently allocated (*) then we don't
> > need to worry about them changing in size.
>
> for example, the ipsec struct starts with:
>
>         int     _ipsec_debug;
>         struct ipsecstat _ipsec4stat;
>         struct secpolicy _ip4_def_policy;
>
>         int     _ip4_esp_trans_deflev;
>         int     _ip4_esp_net_deflev;
>
> This effectively fixes the size of the ipsecstat and secpolicy
> structures. I would like instead to have:
>         int     _ipsec_debug;
>         struct ipsecstat *_ipsec4stat;
>         struct secpolicy *_ip4_def_policy;
>
>         int     _ip4_esp_trans_deflev;
>         int     _ip4_esp_net_deflev;
>
> and have the initializer function allocate those separately.
>
> > (*) actually they could still be allocated as a blob but we would
> > access them as if they are separate.
> >
> > comments?

I'm working on distilling the initializer functions from vimage to
vimage-commit2 branch, and this will probably be a cause of enough
controversy in itself, so that I'd propose to wait a few more days
before engaging in other changes / optimizations. With my pointy-hat
for losing lots of time on, I still think that so far we have been more
or less successful in merging bits over to svn without causing major
damage to other people, so perhaps it might make sense to continue one
step at a time.

Of course, I agree that reducing the impact of changes in random
structures to the layout of vnet_* containers by using pointers instead
of embedded structs would certainly be an improvement in ABI
maintenance terms, but we shouldn't lose sight of the performance
perspective, given that we've already introduced an additional level of
indirection. Ordering of items in vnet_ containers is definitely
something that could and should be played with sooner than later,
though I can't / won't promise to spend a significant amount of time on
that in the next few weeks :)

Marko
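
The layout point discussed above can be illustrated with offsetof(). This is
only a sketch under invented assumptions: stat_v1 and stat_v2 stand for two
hypothetical revisions of some sub-structure (the second one grew by a field),
and the container types are made up for the comparison; none of these names
come from the vimage tree.

```c
#include <assert.h>
#include <stddef.h>

/* Two hypothetical revisions of a sub-structure; v2 gained one counter. */
struct stat_v1 { unsigned long in; unsigned long out; };
struct stat_v2 { unsigned long in; unsigned long out; unsigned long dropped; };

/* Embedded layout: the sub-structure's size is baked into the container. */
struct embedded_v1 { int debug; struct stat_v1 stat; int deflev; };
struct embedded_v2 { int debug; struct stat_v2 stat; int deflev; };

/* Pointer layout: the container only stores a pointer to the sub-structure. */
struct pointer_v1 { int debug; struct stat_v1 *stat; int deflev; };
struct pointer_v2 { int debug; struct stat_v2 *stat; int deflev; };

/* Nonzero if the trailing field moved between revisions (embedded case). */
int
embedded_deflev_moved(void)
{
	return (offsetof(struct embedded_v1, deflev) !=
	    offsetof(struct embedded_v2, deflev));
}

/* Nonzero if the trailing field moved between revisions (pointer case). */
int
pointer_deflev_moved(void)
{
	return (offsetof(struct pointer_v1, deflev) !=
	    offsetof(struct pointer_v2, deflev));
}

/* Growing the sub-structure shifts later fields only when it is embedded. */
void
check_layouts(void)
{
	assert(embedded_deflev_moved());
	assert(!pointer_deflev_moved());
}
```

The pointer variant keeps the container layout stable at the price of one more
dereference on each access, which is the performance trade-off raised above.
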