PERFORCE change 122803 for review

Ulf Lilleengen lulf at FreeBSD.org
Tue Jul 3 21:23:07 UTC 2007


http://perforce.freebsd.org/chv.cgi?CH=122803

Change 122803 by lulf at lulf_carrot on 2007/07/03 21:22:13

	- Clean up code and remove debug printfs.
	- Add initial testplan draft. I also have some testscripts that will be
	  comitted when they're ready.
	- Add my scratchpad TODO.

Affected files ...

.. //depot/projects/soc2007/lulf/TESTPLAN#1 add
.. //depot/projects/soc2007/lulf/TODO#3 edit
.. //depot/projects/soc2007/lulf/gvinum_fixup/sbin/gvinum/gvinum.c#11 edit
.. //depot/projects/soc2007/lulf/gvinum_fixup/sys/geom/vinum/geom_vinum_create.c#2 edit

Differences ...

==== //depot/projects/soc2007/lulf/TODO#3 (text+ko) ====

@@ -8,3 +8,148 @@
 6. Make sure other parts is function correctly, and implement what perhaps is
 not implemented yet.
 7. Run some tests to make sure the new gvinum code-base is good enough.
+
+
+### MY internal scratchpad. This might be outdated, but I'm trying to mark the
+things I've finished with DONE.
+
+1. Find out if we can support adding subdisks to a raid5 plex.
+
+2. Need a way to tell if we're adding a subdisk to a degraded raid5 plex or not.
+This means we only add subdisks to a raid5 plex if the original size is larger
+than the size during attachment. One way is to have a original size of the plex.
+Another way is to allow a subdisk to attach itself to a plex _if_ the plex is
+degraded and all other subdisks are up. The best would probably be to have a
+pled->origsize field to keep the original size of the plex intact. This could
+expand if we later want to expand a raid5 array. DONE.
+
+3. Must make sure that we don't use a subdisks plex if it's detached (same with
+plex and volume). NO
+
+4. Make sure gv_plex_size uses the smallest stripe... makes it possible to have
+unequal sized subdisks, but only parts of it is used... ?NO
+
+5. To support mounted raid5 rebuild... we need to check if both rebuild and
+normal request BIO's can work side by side... As long as it is in the degraded
+state perhaps?
+
+6. To get syncing working... we need a way to read BIOs from one plex and write
+them to another plex. One way of doing this is that we issue a read-bio for one
+plex, and when that bio returns, it will issue a BIO with the data to be written
+to a destination plex. Either, issue all reads at once, or issue one read, and
+let the 'done'-routines handle issueing of the next bios. 
+In both ways, we need a way to tell the destination of a write, or where the
+original data came from. DONE
+
+Todo:
+* Need to check if it works to use bio_caller1. Doesn't, its used, but
+  bio_caller2 works... for now
+* Try to make send_parity work for sync as well. NO, created another one.
+* Restructure gv_sync_completed and split it a bit up to create a better
+  abstraction. DONE
+* Test it. DONE
+* Make sure we don't have write between syncing.
+	Krav:
+	 - Skal kunne lese av mirror mens mountet.
+	 - Skal kunne skrive til mirror mens mountet.
+	 - MÃ¥ delaye writes.
+	 - Reads som kommer når vi har noe i queuen må også delayes.
+	
+ 	To achieve this, we can - for each BIO that comes down to a volume -
+	check that if one plex is syncing, we put the BIO in a wait-queue that
+	is only processed when NO plexes are in GV_PLEX_SYNCING state. 
+
+7. Investigate problem with setting plex state to up even with forcing. Fixed.
+8. Check why panic when using mirror. 
+	It's because a BIO is shared between all plexes for a write. Se my
+	comment in the hack in gv_bio_done for more explanation.
+9. Check weird states showing up when doing rebuild/sync etc and prevent from
+config being listed up all the time. Config not listed anymore.
+10. Make sure we can use bio_caller2 in sync.
+11. Change event structure to include non-pointer parameters for flags etc.
+12. Since GV_SD_INITIALIZING is used by gv_sync and gv_rebuild, we should
+perhaps update s->initialized, or create a new state called syncing or
+something. Because gv_list gives a weird output when s->initialized is not used.
+Perhaps have another state.
+
+13. Don't crash when we try rebuild on a up plex with only down/stale subdisks.
+Perhaps it's okay since we're forcing the state anyway and the user literally
+asks for it.
+
+14. When a raid5 plex is created, the subdisks are stale and the plex is down,
+but it shouldn't. When a mirror is created, the second one is stale, but it
+shouldn't. 
+
+15. Implement initialization. Done
+
+16. Initially, parity is wrong... so must remember to rebuild, but should this
+be necessary? No, init is used
+
+
+* What may this problem be? Me using bio_caller1? No, we don't have enough bits
+* for our flags :) fixed.
+* Got a recursing lock issue. Fixed.
+
+* Syncing problem. Lukas proposed:
+> 
+> Well, that's even easier than the RAID5 rebuild problem: reads are only
+> served from the "good" plex, writes go through to all plexes, as the new
+> data would be written anyway (after a read from the "good" plex).  Some
+> locking would be advisable, though.
+
+The flaw with this is that if a write is done on a plex, it can be overwritten
+by a sync write, so it will have to wait for the sync to be done, and then
+overwrite the sync. We could perhaps examine if the offset is greater than
+current synced offset, and then we again have to make sure all plexes are beyond
+that offset.
+
+Suggestion for handling mounted rebuilds:
+Imagine this situation: you have rebuilt the parity up to offset X of
+the RAID5 plex.  I/O that requests something from below X can go through
+normally, I/O that requests something beyond X needs to run like in
+degraded mode.  The trick is to get the locking right, so that the
+rebuild process and the normal I/O don't interfere when they are around
+the some offset.
+
+* It's already checked that if a plex is syncing and synced > the offset +
+  length, then the request can be served.
+* Otherwise, the request will be served like a degraded or no-parity. The
+  problem is that if NOPARITY or REBUILD is detected (begause s->state is not
+  UP), data will be written without parity, and we'd have to rebuild parity
+  or write the data afterwards.
+
+15. Must test and debug! Check why hanging while rebuilding and copying at the
+same time (initiating copy first)
+
+* Bug #1: When trying to newfs on a degraded volume where the _first_ subdisk is
+  down, i get cg 0: bad magic number. Hmmm....
+  Actually, it's a bug in RAID5 degraded write/read code. We got a data
+  corruption bug. Fixed, we forgot to write correct parity data.
+
+
+* Check if we need to check for REBUILD flag when syncing.
+
+ISSUES:
+* We don't but delayed requests due to rebuild on the global queue, since
+	A) They shouldn't be run by all plexes.
+	B) They were issued before new I/Os, so they should be sendt down before
+	   them.
+* Delayed requests due to REBUILD currently use it's own BIO queue. Perhaps we
+  should rework the way plex queues are used.
+
+18. Rewrite rename, move. DONE.
+
+
+19. Concat/mirror/stripe
+	* Right now, we ask the kernel for each drive. Another way could be to
+	  tell the kernel to send the whole config, and create parse routines
+	  for it.
+
+ * Fix padding in gv_volume for userland. DONE
+
+20. Make sync/rebuild requests also get delayed due to rebuild requests?
+
+20. When to trondheim. Write papers for status report.
+
+21. Updateman-page
+

==== //depot/projects/soc2007/lulf/gvinum_fixup/sbin/gvinum/gvinum.c#11 (text+ko) ====

@@ -375,7 +375,6 @@
 gvinum_concat(int argc, char **argv)
 {
 
-	printf("Preparing\n");
 	if (argc < 2) {
 		warnx("usage:\tconcat [-fv] [-n name] drives\n");
 		return;
@@ -396,7 +395,6 @@
 
 	plexes = subdisks = volumes = 0;
 	drives = 1;
-	printf("Trying to create drive on %s\n", device);
 
 	/* Strip away eventual /dev/ in front. */
 	if (strncmp(device, "/dev/", 5) == 0)
@@ -406,7 +404,6 @@
 	if (drivename == NULL)
 		return (NULL);
 
-	fprintf(stderr, "Drivename %s is okay\n", drivename);
 	req = gctl_get_handle();
 	gctl_ro_param(req, "class", -1, "VINUM");
 	gctl_ro_param(req, "verb", -1, "create");
@@ -422,12 +419,10 @@
 	gctl_ro_param(req, "volumes", sizeof(int), &volumes);
 	gctl_ro_param(req, "plexes", sizeof(int), &plexes);
 	gctl_ro_param(req, "subdisks", sizeof(int), &subdisks);
-	fprintf(stderr, "Creating drive request sent to kernel...\n");
 	errstr = gctl_issue(req);
 	if (errstr != NULL)
 		warnx("error creating drive: %s", errstr);
 	gctl_free(req);
-	printf("Done creating drive %s on %s\n", drivename, device);
 	return (drivename);
 }
 
@@ -474,11 +469,8 @@
 	}
 
 	/* Find a free volume name. */
-	if (volname == NULL) {
-		fprintf(stderr, "Finding name\n");
+	if (volname == NULL)
 		volname = find_name("gvinumvolume", GV_TYPE_VOL, GV_MAXVOLNAME);
-		fprintf(stderr, "Found name %s\n", volname);
-	}
 
 	/* Then we send a request to actually create the volumes. */
 	gctl_ro_param(req, "verb", -1, verb);
@@ -523,7 +515,6 @@
 	comment[0] = '\0';
 
 	/* Find a name. Fetch out configuration first. */
-	printf("Fetching configuration\n");
 	req = gctl_get_handle();
 	gctl_ro_param(req, "class", -1, "VINUM");
 	gctl_ro_param(req, "verb", -1, "getconfig");
@@ -536,7 +527,6 @@
 	}
 	gctl_free(req);
 
-	printf("Got configuration:\n");
 	printf(buf);
 	begin = 0;
 	len = strlen(buf);
@@ -553,7 +543,6 @@
 			if (buf[i] == '\n' || buf[i] == '\0') {
 				ptr = buf + begin;
 				strlcpy(line, ptr, (i - begin) + 1);
-				printf("Processing line: .%s.\n", line);
 				begin = i + 1;
 				switch (type) {
 				case GV_TYPE_DRIVE:
@@ -572,9 +561,7 @@
 				}
 				if (name == NULL)
 					continue;
-				printf("Found a name: .%s.\n", name);
 				if (!strcmp(sname, name)) {
-					printf("Conflicts, try next\n");
 					conflict = 1;
 					/* XXX: Could quit the loop earlier. */
 				}
@@ -928,45 +915,6 @@
 	if (errstr)
 		warnx("%s\n", errstr);
 	gctl_free(req);
-/*	do {
-		rv = 0;
-		req = gctl_get_handle();
-		gctl_ro_param(req, "class", -1, "VINUM");
-		gctl_ro_param(req, "verb", -1, "parityop");
-		gctl_ro_param(req, "flags", sizeof(int), &flags);
-		gctl_ro_param(req, "rebuild", sizeof(int), &rebuild);
-		gctl_rw_param(req, "rv", sizeof(int), &rv);
-		gctl_rw_param(req, "offset", sizeof(off_t), &offset);
-		gctl_ro_param(req, "plex", -1, argv[0]);
-		errstr = gctl_issue(req);
-		if (errstr) {
-			warnx("%s\n", errstr);
-			gctl_free(req);
-			break;
-		}
-		gctl_free(req);
-		if (flags & GV_FLAG_V) {
-			printf("\r%s at %s ... ", msg,
-			    gv_roughlength(offset, 1));
-		}
-		if (rv == 1) {
-			printf("Parity incorrect at offset 0x%jx\n",
-			    (intmax_t)offset);
-			if (!rebuild)
-				break;
-		}
-		fflush(stdout);
-
-		 Clear the -f flag. 
-		flags &= ~GV_FLAG_F;
-	} while (rv >= 0);*/
-
-/*	if ((rv == 2) && (flags & GV_FLAG_V)) {
-		if (rebuild)
-			printf("Rebuilt parity on %s\n", argv[0]);
-		else
-			printf("%s has correct parity\n", argv[0]);
-	}*/
 }
 
 void

==== //depot/projects/soc2007/lulf/gvinum_fixup/sys/geom/vinum/geom_vinum_create.c#2 (text+ko) ====

@@ -54,18 +54,15 @@
 
 	sc = gp->softc;
 	dcount = 0;
-	printf("Starting to create concat device\n");
 	vol = gctl_get_param(req, "name", NULL);
 	if (vol == NULL) {
 		gctl_error(req, "volume's not given");	
 		return;
 	}
-	printf("Got first parameter name: %s\n", vol);
 
 	flags = gctl_get_paraml(req, "flags", sizeof(*flags));
 	drives = gctl_get_paraml(req, "drives", sizeof(*drives));
 
-	printf("Got both flags and drives\n");
 	if (drives == NULL) { 
 		gctl_error(req, "drives not given");
 		return;
@@ -85,18 +82,11 @@
 	p->stripesize = 0;
 	gv_post_event(sc, GV_EVENT_CREATE_PLEX, p, NULL, 0, 0);
 
-	/* XXX: We'll might have a problem making sure all of these things exists
-	 * when we come into the event loop. */
-
-	printf("We have created the volume and plex. Time to create subdisks\n");
-	/* Drives are first (only right now) priority */
+	/* Drives are first (right now) priority */
 	for (dcount = 0; dcount < *drives; dcount++) {
 		snprintf(buf, sizeof(buf), "drive%d", dcount);
 		drive = gctl_get_param(req, buf, NULL);
-		printf("We're looking for drive %s\n", drive);
 		d = gv_find_drive(sc, drive);
-		printf("Creating subdisk for drive %s on device %s\n", d->name,
-		    d->device);
 		if (d == NULL) {
 			gctl_error(req, "No such drive '%s'", drive);
 			continue;
@@ -110,6 +100,7 @@
 		s->size = -1;
 		gv_post_event(sc, GV_EVENT_CREATE_SD, s, NULL, 0, 0);
 	}
+	gv_post_event(sc, GV_EVENT_SAVE_CONFIG, sc, NULL, 0, 0);
 }
 
 /*
@@ -130,26 +121,21 @@
 	dcount = 0;
 	scount = 0;
 	pcount = 0;
-	printf("Starting to create mirror device\n");
 	vol = gctl_get_param(req, "name", NULL);
 	if (vol == NULL) {
 		gctl_error(req, "volume's not given");	
 		return;
 	}
-	printf("Got first parameter name: %s\n", vol);
 
 	flags = gctl_get_paraml(req, "flags", sizeof(*flags));
 	drives = gctl_get_paraml(req, "drives", sizeof(*drives));
 
-	printf("Got both flags and drives\n");
-	/* XXX: Remove when we have support for drivegroups. */
 	if (drives == NULL) { 
 		gctl_error(req, "drives not given");
 		return;
 	}
 
 	/* We must have an even number of drives. */
-	/* XXX: We need supported for striped mirror as well. */
 	if (*drives % 2 != 0) {
 		gctl_error(req, "must have an even number of drives");
 		return;
@@ -186,10 +172,7 @@
 		for (dcount = pcount; dcount < *drives; dcount += 2) {
 			snprintf(buf, sizeof(buf), "drive%d", dcount);
 			drive = gctl_get_param(req, buf, NULL);
-			printf("We're looking for drive %s\n", drive);
 			d = gv_find_drive(sc, drive);
-			printf("Creating subdisk for drive %s on device %s\n",
-			    d->name, d->device);
 			if (d == NULL) {
 				gctl_error(req, "No such drive '%s'", drive);
 				/* XXX: Should we fail instead? */
@@ -208,6 +191,7 @@
 			scount++;
 		}
 	}
+	gv_post_event(sc, GV_EVENT_SAVE_CONFIG, sc, NULL, 0, 0);
 }
 
 /*
@@ -227,7 +211,6 @@
 	sc = gp->softc;
 	dcount = 0;
 	pcount = 0;
-	printf("Starting to create striped device\n");
 	vol = gctl_get_param(req, "name", NULL);
 	if (vol == NULL) {
 		gctl_error(req, "volume's not given");	
@@ -236,8 +219,6 @@
 	flags = gctl_get_paraml(req, "flags", sizeof(*flags));
 	drives = gctl_get_paraml(req, "drives", sizeof(*drives));
 
-	printf("Got both flags and drives\n");
-	/* XXX: Remove when we have support for drivegroups. */
 	if (drives == NULL) { 
 		gctl_error(req, "drives not given");
 		return;
@@ -268,10 +249,7 @@
 	for (dcount = 0; dcount < *drives; dcount++) {
 		snprintf(buf, sizeof(buf), "drive%d", dcount);
 		drive = gctl_get_param(req, buf, NULL);
-		printf("We're looking for drive %s\n", drive);
 		d = gv_find_drive(sc, drive);
-		printf("Creating subdisk for drive %s on device %s\n", d->name,
-		    d->device);
 		if (d == NULL) {
 			gctl_error(req, "No such drive '%s'", drive);
 			continue;
@@ -285,4 +263,5 @@
 		s->size = -1;
 		gv_post_event(sc, GV_EVENT_CREATE_SD, s, NULL, 0, 0);
 	}
+	gv_post_event(sc, GV_EVENT_SAVE_CONFIG, sc, NULL, 0, 0);
 }


More information about the p4-projects mailing list