concurrent sysctl implementation

Mon May 11 22:14:27 UTC 2009

On Monday 11 May 2009 2:27:48 pm jt at 0xabadba.be wrote:
> John,
> 
>      Thank you for your input on this matter, I'm excited to write
> some software for this project since its given me great code to learn
> from as i've grown up (still a kid though :).  My questions are a bit
> more detailed below.
> 
> On Mon, May 11, 2009 at 12:24 PM, John Baldwin <jhb at freebsd.org> wrote:
> > On Friday 08 May 2009 5:41:17 pm Ed Schouten wrote:
> >> A solution would be to solve it as follows:
> >>
> >> - Use a semaphore, initialized to some insane high value to put an upper
> >>   limit on the amount of concurrent sysctl calls. I'm not sure whether
> >>   this is really needed. Maybe this issue is not as serious as we think
> >>   it is.
> >
> > Well, one compromise might be to allow concurrent userland requests if the
> > buffer size is small (say < 1 page).  This would be a quite simple change 
and
> > would cover many common syscalls like fetching an int which don't wire 
memory
> > anyway.
> 
> Why is this a compromise?  Isn't concurrent sysctl calls from userland
> a good thing?  What materials would be good to read other than the
> code and the sysctl manual pages?  You said it would be relatively
> easy to implement this; what methods should I be considering to do
> this in and what part of the code specifically should I be looking at?

Well, in theory a bunch of "small" requests to SYSCTL_PROC() nodes that used 
sysctl_wire_old() (or whatever it is called) could cause the amount of user 
memory wired for sysctls to grow unbounded.  Thus, allowing this limited 
concurrency is a tradeoff as there is a minimal (perhaps only theoretical at 
the moment) risk of removing the safety net.

The patch is quite small, btw, because the locking for the sysctl tree already 
exists, and by using read locks, one can already allow concurrent sysctl 
requests.  There is no need to add any new locks or restructure the sysctl 
tree, just to adjust the locking that is already present.  It might be 
clearer, in fact, to split the sysctl memory lock back out into a separate 
lock.  This would allow "small" sysctl requests to run concurrently with a 
single "large" request whereas in my suggestion in the earlier e-mail, 
the "large" request will block all other user requests until it finishes.

I've actually gone ahead and done this below.

--- //depot/projects/smpng/sys/kern/kern_sysctl.c	2009/05/08 11:53:25
+++ //depot/user/jhb/lock/kern/kern_sysctl.c	2009/05/11 21:58:08
@@ -77,11 +77,12 @@
  * API rather than using the dynamic API.  Use of the dynamic API is
  * strongly encouraged for most code.
  *
- * This lock is also used to serialize userland sysctl requests.  Some
- * sysctls wire user memory, and serializing the requests limits the
- * amount of wired user memory in use.
+ * The sysctlmemlock is used to limit the amount of user memory wired for
+ * sysctl requests.  This is implemented by serializing any userland
+ * sysctl requests larger than a single page via an exclusive lock.
  */
 static struct sx sysctllock;
+static struct sx sysctlmemlock;
 
 #define	SYSCTL_SLOCK()		sx_slock(&sysctllock)
 #define	SYSCTL_SUNLOCK()	sx_sunlock(&sysctllock)
@@ -543,6 +544,7 @@
 {
 	struct sysctl_oid **oidp;
 
+	sx_init(&sysctlmemlock, "sysctl mem");
 	SYSCTL_INIT();
 	SYSCTL_XLOCK();
 	SET_FOREACH(oidp, sysctl_set)
@@ -1563,7 +1565,7 @@
     size_t *oldlenp, int inkernel, void *new, size_t newlen, size_t *retval,
     int flags)
 {
-	int error = 0;
+	int error = 0, memlocked;
 	struct sysctl_req req;
 
 	bzero(&req, sizeof req);
@@ -1603,14 +1605,20 @@
 	if (KTRPOINT(curthread, KTR_SYSCTL))
 		ktrsysctl(name, namelen);
 #endif
-	
-	SYSCTL_XLOCK();
+
+	if (req.oldlen > PAGE_SIZE) {
+		memlocked = 1;
+		sx_xlock(&sysctlmemlock);
+	} else
+		memlocked = 0;
 	CURVNET_SET(TD_TO_VNET(curthread));
 
 	for (;;) {
 		req.oldidx = 0;
 		req.newidx = 0;
+		SYSCTL_SLOCK();
 		error = sysctl_root(0, name, namelen, &req);
+		SYSCTL_SUNLOCK();
 		if (error != EAGAIN)
 			break;
 		uio_yield();
@@ -1620,7 +1628,8 @@
 
 	if (req.lock == REQ_WIRED && req.validlen > 0)
 		vsunlock(req.oldptr, req.validlen);
-	SYSCTL_XUNLOCK();
+	if (memlocked)
+		sx_xunlock(&sysctlmemlock);
 
 	if (error && error != ENOMEM)
 		return (error);

-- 
John Baldwin