From bugmaster at FreeBSD.org Mon Dec 1 03:07:02 2008
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Dec 1 03:09:08 2008
Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org
Message-ID: <200812011107.mB1B721f052677@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/128452  scsi       [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245  scsi       [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927  scsi       [isp] isp(4) target driver crashes kernel when set up
o kern/126866  scsi       [isp] [panic] kernel panic on card initialization
o kern/124667  scsi       [amd] [panic] FreeBSD-7 kernel page faults at amd-scsi
o kern/123674  scsi       [ahc] ahc driver dumping
o kern/123666  scsi       [aac] attach fails with Adaptec SAS RAID 3805 controll
o sparc/121676 scsi       [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487  scsi       [sg] scsi_sg incompatible with scanners
o kern/120247  scsi       [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/119668  scsi       [cam] [patch] certain errors are too verbose comparing
o kern/114597  scsi       [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847  scsi       [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954   scsi       [ahc] reading from DVD failes on 6.x [regression]
o kern/94838   scsi       Kernel panic while mounting SD card with lock switch o
o kern/92798   scsi       [ahc] SCSI problem with timeouts
o kern/90282   scsi       [sym] SCSI bus resets cause loss of ch device
o kern/76178   scsi       [ahd] Problem with ahd and large SCSI Raid system
o kern/74627   scsi       [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165   scsi       [panic] kernel page fault after calling cam_send_ccb
o kern/60641   scsi       [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598   scsi       wire down of scsi devices conflicts with config
s kern/57398   scsi       [mly] Current fails to install on mly(4) based RAID di
o kern/52638   scsi       [panic] SCSI U320 on SMP server won't run faster than
o kern/44587   scsi       dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/40895   scsi       wierd kernel / device driver bug
o kern/39388   scsi       ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/38828   scsi       [dpt] [request] DPT PM2012B/90 doesn't work
o kern/35234   scsi       World access to /dev/pass? (for scanner) requires acce

29 problems total.

From jh at saunalahti.fi Wed Dec 3 07:51:53 2008
From: jh at saunalahti.fi (Jaakko Heinonen)
Date: Wed Dec 3 07:52:15 2008
Subject: kern/88823: [modules] [atapicam] atapicam - kernel trap 12 on loading and unloading
Message-ID: <20081203153258.GA3249@a91-153-125-115.elisa-laajakaista.fi>

Hi,

There is a CAM(4)/pass(4) bug which causes passcleanup() (in
sys/cam/scsi/scsi_pass.c) to call destroy_dev(9) with the device mutex
held. destroy_dev() may sleep, so it must not be called with a mutex
held. Here's the call trace:

destroy_dev(c7b28400,0,c569f754,c7b15080,f46c6a38,...) at destroy_dev+0x10
passcleanup(c7b15080,c0b8f83b,c0bdf975,c585d058,c0e5afe0,...) at passcleanup+0x2e
camperiphfree(c7b15080,0,f46c6a58,c0477b7d,c7b15080,...) at camperiphfree+0xc2
cam_periph_invalidate(c7b15080,c59328d0,f46c6a8c,c0492b4a,c7b15080,...) at cam_periph_invalidate+0x3e
cam_periph_async(c7b15080,100,f46c6b18,0,0,...) at cam_periph_async+0x2d
passasync(c7b15080,100,f46c6b18,0,c7ae0000,...) at passasync+0xca
xpt_async_bcast(0,4,c0b6dbbf,11a5,c7b428c0,...) at xpt_async_bcast+0x32
xpt_async(100,f46c6b18,0,10,c575ccb8,...) at xpt_async+0x194
xpt_bus_deregister(0,0,c7b75b30,378,c577fc00,...) at xpt_bus_deregister+0x4e
free_softc(c577fe64,0,c7b75b30,103,c7b18100,...) at free_softc+0xe1
atapi_cam_detach(c7b18100,c7b30858,c0caa340,9a4,1,...) at atapi_cam_detach+0x7f
device_detach(c7b18100,c081bf09,c7691840,1,c7b760f8,...) at device_detach+0x8c
devclass_delete_driver(c554b6c0,c7b7610c,c0bd0dfd,2d,0,...) at devclass_delete_driver+0x91
driver_module_handler(c7692040,1,c7b760f8,ef,c7692040,...) at driver_module_handler+0xdf
module_unload(c7692040,0,253,250,f46c6c40,...) at module_unload+0x75
linker_file_unload(c7ab9d00,0,c0bcf326,400,c7b73000,...) at linker_file_unload+0xc9
kern_kldunload(c5957460,6,0,f46c6d2c,c0b11ff3,...) at kern_kldunload+0xd5
kldunloadf(c5957460,f46c6cf8,8,c0bd96d0,c0cad660,...) at kldunloadf+0x2b

Calling xpt_bus_deregister() in atapicam results in this code path.
xpt_bus_deregister() must be called with the device mutex held.

The following change fixes the atapicam problem; however, the patch may
be incorrect because I am not sure if passcleanup() is always called
with the lock held. I have tried the patch with atapicam(4) and umass(4)
(both use pass(4)).

%%%
Index: sys/cam/scsi/scsi_pass.c
===================================================================
--- sys/cam/scsi/scsi_pass.c (revision 185331)
+++ sys/cam/scsi/scsi_pass.c (working copy)
@@ -167,7 +167,9 @@ passcleanup(struct cam_periph *periph)

     devstat_remove_entry(softc->device_stats);

+    mtx_unlock(periph->sim->mtx);
     destroy_dev(softc->dev);
+    mtx_lock(periph->sim->mtx);

     if (bootverbose) {
         xpt_print(periph->path, "removing device entry\n");
%%%

There are also other bugs involved in unloading the atapicam module.

* If there are pending hcbs the kernel will panic on unload. There's an
  obvious bug in free_softc(): it uses TAILQ_FOREACH() instead of
  TAILQ_FOREACH_SAFE(). However, fixing that is not enough. There are
  additional problem(s) and I don't have a fix for them.
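To illustrate why a plain TAILQ_FOREACH() loop is a bug when the loop body frees the current element, here is a small Python model. It is not kernel code: Node, build, free, drain_unsafe and drain_safe are hypothetical names, and clearing the link in free() merely stands in for memory that must no longer be trusted. In this model the unsafe loop simply stops early; in the kernel the same read would be a use-after-free.

```python
class Node:
    """One queue element; `next` plays the role of the TAILQ link field."""

    def __init__(self, name):
        self.name = name
        self.next = None


def build(names):
    """Return the head of a singly linked chain holding the given names."""
    head = None
    for name in reversed(names):
        node = Node(name)
        node.next = head
        head = node
    return head


def free(node, freed):
    """Model of freeing an element: record it and destroy its link."""
    freed.append(node.name)
    node.next = None  # after a real free() the link must not be trusted


def drain_unsafe(head):
    """Plain FOREACH shape: the link is read after the body has run."""
    freed = []
    node = head
    while node is not None:
        free(node, freed)
        node = node.next  # reads the link of an element that was just freed
    return freed


def drain_safe(head):
    """FOREACH_SAFE shape: the successor is saved before the body runs."""
    freed = []
    node = head
    while node is not None:
        successor = node.next  # saved first, like the extra temporary variable
        free(node, freed)
        node = successor
    return freed


print(drain_unsafe(build(["hcb0", "hcb1", "hcb2"])))
print(drain_safe(build(["hcb0", "hcb1", "hcb2"])))
```

The only difference between the two loops is the point at which the successor is read, which is what the extra temporary argument of the _SAFE macro variant provides.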
  Here's a patch that changes it to refuse to detach if there are
  pending hcbs:

%%%
Index: sys/dev/ata/atapi-cam.c
===================================================================
--- sys/dev/ata/atapi-cam.c (revision 185519)
+++ sys/dev/ata/atapi-cam.c (working copy)
@@ -254,6 +254,13 @@ atapi_cam_detach(device_t dev)
     struct atapi_xpt_softc *scp = device_get_softc(dev);

     mtx_lock(&scp->state_lock);
+    /*
+     * XXX: Detaching when pending hcbs exist is broken.
+     */
+    if (!TAILQ_EMPTY(&scp->pending_hcbs)) {
+        mtx_unlock(&scp->state_lock);
+        return (EBUSY);
+    }
     xpt_freeze_simq(scp->sim, 1 /*count*/);
     scp->flags |= DETACHING;
     mtx_unlock(&scp->state_lock);
@@ -882,11 +889,11 @@ free_hcb(struct atapi_hcb *hcb)
 static void
 free_softc(struct atapi_xpt_softc *scp)
 {
-    struct atapi_hcb *hcb;
+    struct atapi_hcb *hcb, *tmp_hcb;

     if (scp != NULL) {
         mtx_lock(&scp->state_lock);
-        TAILQ_FOREACH(hcb, &scp->pending_hcbs, chain) {
+        TAILQ_FOREACH_SAFE(hcb, &scp->pending_hcbs, chain, tmp_hcb) {
             free_hcb_and_ccb_done(hcb, CAM_UNREC_HBA_ERROR);
         }
         if (scp->path != NULL) {
%%%

* cd(4) doesn't tolerate disappearing devices well. There's code in
  cdoninvalidate() to invalidate further I/O operations, but calling,
  for example, d_close causes a crash. Thus you can't unmount a file
  system after the device has disappeared. This patch makes it survive
  unmounting.

%%%
Index: sys/cam/scsi/scsi_cd.c
===================================================================
--- sys/cam/scsi/scsi_cd.c (revision 185331)
+++ sys/cam/scsi/scsi_cd.c (working copy)
@@ -382,6 +382,9 @@ cdoninvalidate(struct cam_periph *periph
         camq_remove(&softc->changer->devq, softc->pinfo.index);

     disk_gone(softc->disk);
+    softc->disk->d_drv1 = NULL;
+    softc->disk->d_close = NULL; /* allow closing the disk */
+
     xpt_print(periph->path, "lost device\n");
 }

%%%

--
Jaakko

From jpaetzel at FreeBSD.org Thu Dec 4 18:50:47 2008
From: jpaetzel at FreeBSD.org (Josh Paetzel)
Date: Thu Dec 4 18:50:53 2008
Subject: Problem with RAID1 Disk on Freebsd
In-Reply-To: <124704c40811270052s1d215d24kc7b057da17a1cb83@mail.gmail.com>
References: <124704c40811270052s1d215d24kc7b057da17a1cb83@mail.gmail.com>
Message-ID: <493892AE.8060607@FreeBSD.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

shakeb ainul wrote:
> Hi,
>
> I am a developer with one of the top IT companies in Asia. I have the
> following problem with my RAID 1 server.
>
> Last week, the only disk of my server failed due to unknown reason and it
> required a reboot of the server. Following error messages were logged in:
>
> /var/log/messages
>
> Nov 21 02:18:17 server1 kernel: ciss0: *** SCSI bus speed downshifted, SCSI
> port 2
> Nov 21 02:20:58 server1 kernel: ciss0: *** SCSI bus speed downshifted, SCSI
> port 2
> Nov 21 02:22:37 server1 kernel: ciss0: *** SCSI bus speed downshifted, SCSI
> port 2
> Nov 21 02:31:01 server1 kernel: ciss0: *** Physical drive failure: SCSI port
> 2 ID 1
> Nov 21 02:31:01 server1 kernel: ciss0: *** State change, logical drive 0
> Nov 21 02:31:01 server1 kernel: ciss0: logical drive 0 (da0) changed status
> OK->interim recovery, spare status 0x0
>
> Attached is the dmesg.boot file of my server.
>
> Please advise on what could be the possible causes for this fault and what
> can we do to ensure it does not happen again in future.
>
> Thanks in anticipation.
>
> Regards,
> SHAKEB AINUL

- From what I can tell based on the information you've provided a drive
failed in such a way that the controller tried stepping the bus down.
When that didn't help it faulted the drive out of the array. But you
don't say, and it doesn't seem discernable from the information you
provided, what happened after that. Did the server hang or reboot or
something?

- --
Thanks,
Josh Paetzel

PGP: 8A48 EF36 5E9F 4EDA 5ABC 11B4 26F9 01F1 27AF AECB
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAkk4kq4ACgkQJvkB8SevrsurUACeLxHdjvTxdo6IMxJrAbvJh9mK
63cAnR2iW1eOKCv8bEV77VeKM8oCSumh
=GJe2
-----END PGP SIGNATURE-----

From gjb at semihalf.com Fri Dec 5 02:30:06 2008
From: gjb at semihalf.com (Grzegorz Bernacki)
Date: Fri Dec 5 02:30:12 2008
Subject: USB stick probing
Message-ID: <4939002E.9070001@semihalf.com>

Hi,

We have a problem with discovering a USB stick. We see the following
output after inserting the stick:

(da0:umass-sim0:0:0:0): Retrying Command
(da0:umass-sim0:0:0:0): Retrying Command
(da0:umass-sim0:0:0:0): Retrying Command
(da0:umass-sim0:0:0:0): Retrying Command
(da0:umass-sim0:0:0:0): error 6
(da0:umass-sim0:0:0:0): Unretryable Error
da0 at umass-sim0 bus 0 target 0 lun 0
da0: Removable Direct Access SCSI-2 device
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: UNIT ATTENTION, Not ready to ready change

I turned on some debugging to see exactly which command fails, and this
is what the sequence of commands looks like:
- INQUIRY
- INQUIRY
- TEST UNIT READY, which fails with Not Ready
- READ CAPACITY, which fails a few times and then we get the error.

My knowledge of CAM/XPT is very limited, so I have some questions
regarding the probing of devices by CAM/XPT:

1) TEST UNIT READY is sent in the PROBE_TUR_FOR_NEGOTIATION state, but
we don't check for SCSI errors. So if TEST UNIT READY fails we just go
on to the next state without retrying the command. Why don't we care
whether the device is ready or not?
Shouldn't we check the errors and retry the command? Is there any reason
why we skip error checking?

2) After the USB stick is inserted we start from the PROBE_INQUIRY
state. Is that the expected behaviour? I thought we should start from
PROBE_TUR.

I am going to check the status of the command in
PROBE_TUR_FOR_NEGOTIATION and retry the command if it fails. Is that a
good solution? Maybe it can be solved in some other, easier way.

Thanks in advance,
Grzesiek

From morganw at chemikals.org Sun Dec 7 11:46:18 2008
From: morganw at chemikals.org (Wes Morgan)
Date: Sun Dec 7 11:46:28 2008
Subject: CAM documentation
Message-ID:

I'm looking at porting a linux program that uses their generic cdrom
routines to basically pass through commands to the cd. From what I
gather, the way to do this in FreeBSD is using CAM and possibly
ATAPICAM. However, I can't find much documentation on the CAM system
other than the man pages and the code for camcontrol. Is there any
in-depth API documentation available?

Thanks!

From chuck at tuffli.net Sun Dec 7 20:29:07 2008
From: chuck at tuffli.net (Chuck Tuffli)
Date: Sun Dec 7 20:29:16 2008
Subject: CAM documentation
In-Reply-To:
References:
Message-ID:

On Dec 7, 2008, at 11:45 AM, Wes Morgan wrote:
> I'm looking at porting a linux program that uses their generic
> cdrom routines to basically pass through commands to the cd. From
> what I gather, the way to do this in FreeBSD is using CAM and
> possibly ATAPICAM. However, I can't find much documentation on the
> CAM system other than the man pages and the code for camcontrol. Is
> there any in-depth API documentation available?
http://www.freebsd.org/doc/en_US.ISO8859-1/books/arch-handbook/scsi.html

From bra at fsn.hu Thu Dec 11 05:38:07 2008
From: bra at fsn.hu (Attila Nagy)
Date: Thu Dec 11 05:38:14 2008
Subject: FreeBSD -CURRENT regression (only 3.3 MBps on ahc)
Message-ID: <49411408.2070809@fsn.hu>

Hello,

I have a server, which has two SCSI controllers, one (two channels) for
the inner disks and one for a directly attached storage (a Promise
RM8000 box, which has 8 ATA drives). Because I use ZFS on the RM8000,
I've switched to -CURRENT some time ago and noticed that the array's
speed fell sharply.
Until now I didn't have the time to investigate it further, but now I found out the following: - in the controller's BIOS the device is recognized as U160 (16 bit wide, 80 MHz) - in dmesg the device can be seen as: ahc2: port 0x2800-0x28ff mem 0xfea90000-0xfea90fff irq 31 at device 8.0 on pci1 da2 at ahc2 bus 0 target 0 lun 0 da2: Fixed Direct Access SCSI-3 device da2: 3.300MB/s transfers da2: 1525878MB (3124999168 512 byte sectors: 255H 63S/T 194522C) The inner disks are OK: da0: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit) When I issue a dd, I get exactly 3.3 MBps transfer rates: dd if=/dev/da2 of=/dev/null bs=1M ^C33+0 records in 33+0 records out 34603008 bytes transferred in 10.130446 secs (3415744 bytes/sec) Which is worse than that is a parallel read to eight (array as a JBOD, each disk as a RAID0 array) disks: L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name 1 3 3 383 323.9 0 0 0.0 96.9| da2 1 3 3 383 323.9 0 0 0.0 96.9| da3 1 3 3 383 323.9 0 0 0.0 96.9| da4 1 3 3 383 323.9 0 0 0.0 96.9| da5 1 3 3 383 323.7 0 0 0.0 96.8| da6 1 3 3 383 323.9 0 0 0.0 96.9| da7 1 3 3 383 323.9 0 0 0.0 96.9| da8 1 3 3 383 323.8 0 0 0.0 96.8| da9 It makes the array pretty useless... 
However, when I do a: camcontrol negotiate da2 -a -W 16 Current Parameters: (pass3:ahc2:0:0:0): sync parameter: 0 (pass3:ahc2:0:0:0): offset: 0 (pass3:ahc2:0:0:0): bus width: 8 bits (pass3:ahc2:0:0:0): disconnection is disabled (pass3:ahc2:0:0:0): tagged queueing is disabled New Parameters: (pass3:ahc2:0:0:0): sync parameter: 0 (pass3:ahc2:0:0:0): offset: 0 (pass3:ahc2:0:0:0): bus width: 16 bits (pass3:ahc2:0:0:0): disconnection is disabled (pass3:ahc2:0:0:0): tagged queueing is disabled (TCQ, disconnection is disabled in the BIOS, because that was my first idea) After this, I get: dd if=/dev/da2 of=/dev/null bs=1M ^C76+0 records in 76+0 records out 79691776 bytes transferred in 11.871042 secs (6713124 bytes/sec) and: camcontrol inquiry da2 pass3: Fixed Direct Access SCSI-3 device pass3: Serial Number pass3: 6.600MB/s transfers (16bit) The transfer rate has doubled. I have tried to do a: camcontrol negotiate da2 -a -R 80 -W 16 Current Parameters: (pass3:ahc2:0:0:0): sync parameter: 0 (pass3:ahc2:0:0:0): offset: 0 (pass3:ahc2:0:0:0): bus width: 16 bits (pass3:ahc2:0:0:0): disconnection is disabled (pass3:ahc2:0:0:0): tagged queueing is disabled New Parameters: (pass3:ahc2:0:0:0): sync parameter: 0 (pass3:ahc2:0:0:0): offset: 0 (pass3:ahc2:0:0:0): bus width: 16 bits (pass3:ahc2:0:0:0): disconnection is disabled (pass3:ahc2:0:0:0): tagged queueing is disabled but without any effect. I can't show a dmesg now from the previous kernels, but this server could achieve 20-30 MBps from that array with FreeBSD 5,6 (when it went out of production) and certainly not just 3.3 MBps with 7-STABLE until a point. I have had a: @(#)FreeBSD 7.0-STABLE #4: Mon Jun 9 12:21:13 CEST 2008 kernel which worked reasonably well (performance wise, ZFS was unstable) and I first noticed the slowdowns when I've upgraded from that, first in the line of STABLE, then to CURRENT. Any ideas about that? 
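A rough way to read the numbers above. The sketch assumes (this is not stated in the thread) that the rate CAM announces is the negotiated clock multiplied by the bus width in bytes, with a 3.3 MHz figure standing in when no synchronous transfer has been negotiated. The names ASYNC_BASE_MHZ, announced_rate_mb_s and measured_rate are hypothetical; the byte counts and durations are copied from the dd runs quoted in the message.

```python
ASYNC_BASE_MHZ = 3.3  # assumed fallback when the sync parameter is 0


def announced_rate_mb_s(sync_mhz, bus_width_bits):
    """Model of the announced rate: clock in MHz times bus width in bytes."""
    clock = sync_mhz if sync_mhz else ASYNC_BASE_MHZ
    return clock * (bus_width_bits // 8)


def measured_rate(bytes_transferred, seconds):
    """Throughput in bytes per second, the figure dd prints in parentheses."""
    return bytes_transferred / seconds


# Asynchronous, 8 bit: the "3.300MB/s transfers" seen for da2.
print(announced_rate_mb_s(0, 8))
# Asynchronous, 16 bit: the "6.600MB/s transfers (16bit)" after camcontrol -W 16.
print(announced_rate_mb_s(0, 16))
# 80 MHz DT, 16 bit: the "160.000MB/s transfers" reported for da0.
print(announced_rate_mb_s(80, 16))

# The two dd runs from the message, before and after widening the bus.
print(round(measured_rate(34603008, 10.130446)))
print(round(measured_rate(79691776, 11.871042)))
```

Under that assumption the measured doubling is exactly what widening the bus alone would give, and the unchanged "sync parameter: 0" lines suggest the synchronous rate request did not take effect.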
Thanks,

From ptyll at nitronet.pl Wed Dec 17 14:36:06 2008
From: ptyll at nitronet.pl (Pawel Tyll)
Date: Wed Dec 17 14:36:12 2008
Subject: Adaptec 52445
Message-ID: <1048137506.20081217231228@nitronet.pl>

Hello list,

I'm having problems with Adaptec 52445 controller.

System doesn't detect drives connected to the controller as disks (I
expect something like da0?). There are pass devices in the system, and
'camcontrol inquiry 0:0' returns: < ST3750330AS SD15> Fixed unknown
SCSI-5 device, but still there is no way to use the disk and GEOM
doesn't detect it.

Second problem surfaced when looking for a workaround for the first one.
After detaching the drive from the controller, it properly vanishes from
arcconf utility, however FreeBSD is still very sure that the drive is
attached. Pass device is still present and camcontrol lists the device
in devlist, however accessing this device via simple volume created on
it with arcconf, which is still present in the system even when it's
faulty, as all components of this "volume" are gone, causes controller
to crash.

Has anyone had similar experience with this controller and got around
them somehow? I suppose I could use RAID6 from the controller itself,
but I'm hoping to give ZFS a try.
The above happens on 7.0-PRERELEASE and 7.1-RC, with the newest firmware
and driver for the controller (build 16343).

Cheers!

From bu7cher at yandex.ru Wed Dec 17 23:53:24 2008
From: bu7cher at yandex.ru (Andrey V. Elsukov)
Date: Wed Dec 17 23:53:31 2008
Subject: Adaptec 52445
In-Reply-To: <1048137506.20081217231228@nitronet.pl>
References: <1048137506.20081217231228@nitronet.pl>
Message-ID: <4949FECC.80706@yandex.ru>

Pawel Tyll wrote:
> I'm having problems with Adaptec 52445 controller.
>
> System doesn't detect drives connected to the controller as disks (I
> expect something like da0?). There are pass devices in the system, and
> 'camcontrol inquiry 0:0' returns: < ST3750330AS SD15> Fixed unknown
> SCSI-5 device, but still there is no way to use the disk and GEOM
> doesn't detect it.

IMHO, you should create logical volumes in the controller BIOS. After
that the system will detect them as aacdX disks.

--
WBR, Andrey V. Elsukov

From yohimba at mail.ru Thu Dec 18 23:02:10 2008
From: yohimba at mail.ru (Vyacheslav I.)
Date: Thu Dec 18 23:02:34 2008
Subject: Problem with FreeBSD for AMD64 & Mylex AcceleRAID
Message-ID:

Hi!

I used to use FreeBSD on the i386 architecture together with a Mylex
AcceleRAID 170 controller. There was RAID 0+1 configured on it out of 5
discs (Enhanced Mirroring).

# pciconf -lvc
...
mly0@pci0:6:1:0: class=0x010400 card=0x00521069 chip=0x00501069 rev=0x02 hdr=0x00
    vendor = 'Mylex Corp'
    device = 'AcceleRAID Disk Array'
    class = mass storage
    subclass = RAID
    cap 01[80] = powerspec 2 supports D0 D3 current D0

I used this configuration on the i386 architecture with the following
FreeBSD versions: 4.x, 5.x, 6.x and 7.0. It always worked perfectly.

But recently I needed 8 GB of RAM, so I had to install FreeBSD for
AMD64. First I tried 7.0, and then RELENG_7_1 dated 12.12.2008.
Both of these systems are unstable with the file system located on the
RAID! Meanwhile the same hardware on the i386 architecture works with
RELENG_7_1 perfectly.
Hardware: MB INTEL DG33FB, CPU Intel(R) Core(TM)2 Quad CPU @ 2.40 GHz,
RAM 8 GB. The base system is installed on a separate IDE disc.

# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/ad4s1a    496M    111M    345M    24%    /
devfs          1.0K    1.0K      0B   100%    /dev
/dev/ad4s1d    3.9G     22K    3.6G     0%    /tmp
/dev/ad4s1f    333G     52G    255G    17%    /usr
/dev/ad4s1e     15G     41M     14G     0%    /var
/dev/da0s1d   70.7G    410K   70.1G     0%    /mnt/da0/d

I carried out the following test, creating 3 large files of 1 GB each
and then deleting them:

# cd /mnt/da0/d
# dd if=/dev/zero of=test1 bs=1024k count=1024
# dd if=/dev/zero of=test2 bs=1024k count=1024
# dd if=/dev/zero of=test5 bs=1024k count=1024
# rm test*

After that I tried to unmount the file system, but I got a kernel panic:

# cd /
# umount /mnt/da0/d
...
bad block 123456789, ino 176
dev = da0s1d, block = 4, fs = /mnt/da0/d
panic: ffs_blkfree: freeing free block
cpuid = 3
Uptime: 34 min...

The test was successful when using the i386 architecture with the same
hardware.

On a friend's advice, I used a dirty patch:

=== [code] ===
diff -Nru src.orig/sys/cam/scsi/scsi_da.c src/sys/cam/scsi/scsi_da.c
--- src.orig/sys/cam/scsi/scsi_da.c 2008-12-10 10:01:40.000000000 +0800
+++ src/sys/cam/scsi/scsi_da.c 2008-12-19 12:18:27.000000000 +0800
@@ -1187,7 +1187,7 @@
     if (match != NULL)
         softc->quirks = ((struct da_quirk_entry *)match)->quirks;
     else
-        softc->quirks = DA_Q_NONE;
+        softc->quirks = DA_Q_NO_SYNC_CACHE; // Dirty hack for AMD64

     /* Check if the SIM does not want 6 byte commands */
     xpt_setup_ccb(&cpi.ccb_h, periph->path, /*priority*/1);
=== [/code] ===

This patch solved the problem and the system stopped panicking. But I
assume that this solution is wrong, as the hardware works well on the
i386 architecture without this patch.
From raj at semihalf.com Fri Dec 19 03:00:05 2008 From: raj at semihalf.com (Rafal Jaworowski) Date: Fri Dec 19 03:00:11 2008 Subject: Better handling of TEST UNIT READY command Message-ID: <494B7C34.7050108@semihalf.com> Hi Scott, Can you have a look at this patch: http://people.freebsd.org/~raj/patches/misc/sys-cam-cam_xpt.diff Greg was trying to discuss this on scsi@ already, but there wasn't much response, so your input would be appreciated. The description of the problem and fix is included in the patch, but if you'd have any questions let me know. Rafal From scottl at samsco.org Fri Dec 19 07:21:59 2008 From: scottl at samsco.org (Scott Long) Date: Fri Dec 19 07:22:05 2008 Subject: Better handling of TEST UNIT READY command In-Reply-To: <494B7C34.7050108@semihalf.com> References: <494B7C34.7050108@semihalf.com> Message-ID: <494BB98C.3030404@samsco.org> Rafal Jaworowski wrote: > Hi Scott, > Can you have a look at this patch: > http://people.freebsd.org/~raj/patches/misc/sys-cam-cam_xpt.diff > > Greg was trying to discuss this on scsi@ already, but there wasn't much > response, so your input would be appreciated. > > The description of the problem and fix is included in the patch, but if you'd > have any questions let me know. > > Rafal Hi, Sorry I missed the original posting on the scsi list. I'm curious what ASC/ASCQ codes are being returned by the device. This patch is good enough for now, please go ahead and commit it, but I'd like to see what the actual errors are so we can see if there's a more elegant way to handle it. 
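For reference, the ASC/ASCQ pair asked about here is carried in the sense data a device returns after a CHECK CONDITION. The following is a minimal sketch, assuming the fixed and descriptor sense layouts defined by SPC; decode_sense and the sample buffer are illustrative names and values, not bytes captured from the device in question.

```python
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}


def decode_sense(buf):
    """Return (sense key, ASC, ASCQ) from a raw SCSI sense buffer."""
    if not buf:
        raise ValueError("empty sense buffer")
    response_code = buf[0] & 0x7F
    if response_code in (0x70, 0x71):  # fixed format
        if len(buf) < 14:
            raise ValueError("fixed format sense data too short for ASC/ASCQ")
        return buf[2] & 0x0F, buf[12], buf[13]
    if response_code in (0x72, 0x73):  # descriptor format
        if len(buf) < 4:
            raise ValueError("descriptor format sense data too short")
        return buf[1] & 0x0F, buf[2], buf[3]
    raise ValueError("unknown sense response code 0x%02x" % response_code)


# Hand-built fixed format buffer: sense key 6h, ASC 28h, ASCQ 00h.
sample = bytes([0x70, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x0A,
                0x00, 0x00, 0x00, 0x00, 0x28, 0x00, 0x00, 0x00,
                0x00, 0x00])
key, asc, ascq = decode_sense(sample)
print(SENSE_KEYS.get(key, "key %xh" % key), "asc=%02xh" % asc, "ascq=%02xh" % ascq)
```

The sample encodes UNIT ATTENTION with ASC/ASCQ 28h/00h, the standard code for a not-ready-to-ready change, which matches the text logged in the USB stick report earlier in this archive. Whether the device actually returned those bytes is not shown in the thread.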
Scott From linimon at FreeBSD.org Sun Dec 21 21:32:30 2008 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Sun Dec 21 21:32:36 2008 Subject: kern/129602: [ahd] ahd(4) gets confused and wedges SCSI bus Message-ID: <200812220532.mBM5WTBX080169@freefall.freebsd.org> Synopsis: [ahd] ahd(4) gets confused and wedges SCSI bus Responsible-Changed-From-To: freebsd-i386->freebsd-scsi Responsible-Changed-By: linimon Responsible-Changed-When: Mon Dec 22 05:32:06 UTC 2008 Responsible-Changed-Why: Probably not i386-specific. http://www.freebsd.org/cgi/query-pr.cgi?pr=129602 From bugmaster at FreeBSD.org Mon Dec 22 03:06:58 2008 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Dec 22 03:09:05 2008 Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org Message-ID: <200812221106.mBMB6v8X060700@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description
--------------------------------------------------------------------------------
o kern/129602  scsi  [ahd] ahd(4) gets confused and wedges SCSI bus
o kern/128452  scsi  [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245  scsi  [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927  scsi  [isp] isp(4) target driver crashes kernel when set up
o kern/126866  scsi  [isp] [panic] kernel panic on card initialization
o kern/124667  scsi  [amd] [panic] FreeBSD-7 kernel page faults at amd-scsi
o kern/123674  scsi  [ahc] ahc driver dumping
o kern/123666  scsi  [aac] attach fails with Adaptec SAS RAID 3805 controll
o sparc/121676 scsi  [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487  scsi  [sg] scsi_sg incompatible with scanners
o kern/120247  scsi  [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/119668  scsi  [cam] [patch] certain errors are too verbose comparing
o kern/114597  scsi  [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847  scsi  [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954   scsi  [ahc] reading from DVD failes on 6.x [regression]
o kern/94838   scsi  Kernel panic while mounting SD card with lock switch o
o kern/92798   scsi  [ahc] SCSI problem with timeouts
o kern/90282   scsi  [sym] SCSI bus resets cause loss of ch device
o kern/76178   scsi  [ahd] Problem with ahd and large SCSI Raid system
o kern/74627   scsi  [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165   scsi  [panic] kernel page fault after calling cam_send_ccb
o kern/60641   scsi  [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598   scsi  wire down of scsi devices conflicts with config
s kern/57398   scsi  [mly] Current fails to install on mly(4) based RAID di
o kern/52638   scsi  [panic] SCSI U320 on SMP server won't run faster than
o kern/44587   scsi  dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/40895   scsi  wierd kernel / device driver bug
o kern/39388   scsi  ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/38828   scsi  [dpt] [request] DPT PM2012B/90 doesn't work
o kern/35234   scsi  World access to /dev/pass? (for scanner) requires acce

30 problems total.

From e.scholtz at argonsoft.de Sat Dec 27 15:12:32 2008
From: e.scholtz at argonsoft.de (Erik Scholtz, ArgonSoft GmbH)
Date: Sat Dec 27 15:12:39 2008
Subject: Problem with disklabel and filesystem over iSCSI
Message-ID: <4956B01B.3000509@argonsoft.de>

Hi,

over the last few days I tried to get a 1TB SAN to work with FreeBSD 7.x. I
tried it with the following releases:

FreeBSD web3 7.0-RELEASE FreeBSD 7.0-RELEASE #0: i386
FreeBSD 7.1-RC1 amd64
FreeBSD 7.1-RC2 amd64

Unfortunately I could not get it to work. I think there is a problem with
the disklabels. The UFS could not be written successfully. So I tried
the same with ZFS. ZFS seems to work. After creating a pool and mounting
the FS it can be used normally. But after unmounting the FS and
rebooting the system, the FS is corrupted. Remounting the FS without
rebooting works without any problems.

Here is the "log":

***********************************************************************************

(iSCSI session successfully created, the device is known as da0)
(creating the partition)

web3# fdisk /dev/da0s1
******* Working on device /dev/da0s1 *******
parameters extracted from in-core disklabel are:
cylinders=99693 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=99693 heads=255 sectors/track=63 (16065 blks/cyl)

fdisk: invalid fdisk partition table found
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 1601567982 (782015 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 364/ head 254/ sector 63
The data for partition 2 is:

The data for partition 3 is:

The data for partition 4 is:


(write out the partition table)
web3# dd if=/dev/da0 of=/partition1.bin bs=1 count=64 skip=446 seek=446

***********************************************************************************

(write the label)
web3# bsdlabel -w /dev/da0s1

web3# bsdlabel /dev/da0s1
bsdlabel: /dev/da0s1: no valid label found

(write out the partition table)
web3# dd if=/dev/da0 of=/partition2.bin bs=1 count=64 skip=446 seek=446

***********************************************************************************

(create the filesystem)
newfs -O2 /dev/da0s1
/dev/da0s1: 782023.5MB (1601584044 sectors) block size 16384, fragment size 2048
using 4256 cylinder groups of 183.77MB, 11761 blks, 23552 inodes.
super-block backups (for fsck -b #) at:
160, 376512, 752864, ... ... ...
1601377920
internal error: can't find block in cyl 0

(write out the partition table)
web3# dd if=/dev/da0 of=/partition3.bin bs=1 count=64 skip=446 seek=446

***********************************************************************************

(after UFS was not successful, the partition was destroyed again)
(now trying ZFS)

web3# zpool create tank da0

web3# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME   STATE    READ WRITE CKSUM
        tank   ONLINE      0     0     0
          da0  ONLINE      0     0     0

errors: No known data errors

***********************************************************************************

(list of mounted volumes)

web3# mount
/dev/ad4s1a on / (ufs, local)
devfs on /dev (devfs, local)
tank on /tank (zfs, local)

***********************************************************************************

(copy a file to the ZFS volume and check md5 before and after)

web3# md5 /sbin/init
MD5 (/sbin/init) = 6a374bc84a8b89822964e2a73ed2af18
web3# cp /sbin/init /tank/.
web3# md5 /tank/init
MD5 (/tank/init) = 6a374bc84a8b89822964e2a73ed2af18

(the ZFS volume can be unmounted and mounted again - md5 still correct)

***********************************************************************************

(system reboot)
(iSCSI session reinitiated)

web3# zpool status
  pool: tank
 state: FAULTED
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME   STATE    READ WRITE CKSUM
        tank   FAULTED     0     0     0  corrupted data
          da0  UNAVAIL     0     0     0  corrupted data

(FS not usable anymore - files are lost)

***********************************************************************************

In the attachment you'll find the results of the dd.
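
The 64 bytes that the dd commands above copy from offset 446 are the four 16-byte MBR partition entries. The following is a minimal sketch, not taken from any FreeBSD tool, of how such a dump could be decoded to cross-check the fdisk output. The function names and the `FDISK_ENTRY` sample are hypothetical; the sample is built from the values fdisk printed for partition 1, not read from the real device. Because dd was run with `seek=446`, the saved files should carry the table in their last 64 bytes, which is why `load_table` reads from the end of the file (an assumption about the dump files, which are not shown in the thread).

```python
import struct

SECTOR_SIZE = 512          # "Media sector size is 512" in the fdisk output
HEADS, SECTORS = 255, 63   # geometry reported by fdisk (16065 blks/cyl)


def decode_chs(raw):
    """Decode a packed 3-byte MBR CHS field into (cylinder, head, sector)."""
    head, sec_cyl, cyl_low = raw
    cylinder = ((sec_cyl & 0xC0) << 2) | cyl_low  # 10-bit cylinder number
    sector = sec_cyl & 0x3F                       # 6-bit sector, 1-based
    return cylinder, head, sector


def parse_mbr_entries(table):
    """Parse the 64-byte partition table (bytes 446..509 of sector 0)."""
    if len(table) != 64:
        raise ValueError("expected exactly 64 bytes")
    entries = []
    for i in range(4):
        flag, chs_beg, sysid, chs_end, start, size = struct.unpack(
            "<B3sB3sII", table[16 * i:16 * (i + 1)])
        entries.append({
            "flag": flag,
            "sysid": sysid,
            "beg": decode_chs(chs_beg),
            "end": decode_chs(chs_end),
            "start": start,
            "size": size,
            "meg": size * SECTOR_SIZE // (1024 * 1024),
        })
    return entries


def lba_to_chs(lba):
    """Convert an absolute sector number to (cylinder, head, sector)."""
    cylinder, rem = divmod(lba, HEADS * SECTORS)
    head, sector0 = divmod(rem, SECTORS)
    return cylinder, head, sector0 + 1


def load_table(path):
    """Read a dd dump and return its last 64 bytes (hypothetical helper)."""
    with open(path, "rb") as fh:
        return fh.read()[-64:]


# Hypothetical sample: partition 1 exactly as fdisk printed it above.
FDISK_ENTRY = struct.pack(
    "<B3sB3sII",
    0x80,                      # flag 80 (active)
    bytes([1, 0x01, 0x00]),    # beg: cyl 0 / head 1 / sector 1
    0xA5,                      # sysid 165 (FreeBSD)
    bytes([254, 0x7F, 0x6C]),  # end: cyl 364 / head 254 / sector 63
    63,                        # start
    1601567982,                # size in sectors
)
```

Decoding `FDISK_ENTRY + bytes(48)` with `parse_mbr_entries` gives back the fdisk figures, including 782015 Meg. The last sector of the slice, `lba_to_chs(63 + 1601567982 - 1)`, falls on cylinder 99692, head 254, sector 63. Reduced to the 10-bit cylinder field, 99692 mod 1024 is 364, which is the "end: cyl 364" value in the log, so the entry agrees with the 99693-cylinder geometry.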

Greetings,
Erik

-----
ArgonSoft GmbH | Im Ermlisgrund 3 | 76337 Waldbronn
Tel: +49 7243 71520 | Fax: +49 7243 715222 | http://www.argonsoft.de
Umsatzsteuer-Identnummer: DE205762306 | Handelsregister: HRB2372E
Geschäftsführer: Erik Scholtz

From danny at cs.huji.ac.il Sun Dec 28 01:46:59 2008
From: danny at cs.huji.ac.il (Danny Braniss)
Date: Sun Dec 28 01:47:06 2008
Subject: Problem with disklabel and filesystem over iSCSI
In-Reply-To: <4956B01B.3000509@argonsoft.de>
References: <4956B01B.3000509@argonsoft.de>
Message-ID:

hi,
Which iSCSI initiator are you using?

danny

From e.scholtz at argonsoft.de Sun Dec 28 02:09:17 2008
From: e.scholtz at argonsoft.de (Erik Scholtz, ArgonSoft GmbH)
Date: Sun Dec 28 02:09:24 2008
Subject: Problem with disklabel and filesystem over iSCSI
In-Reply-To:
References: <4956B01B.3000509@argonsoft.de>
Message-ID: <49575045.4090601@argonsoft.de>

Hi,

I tested the default iSCSI_initiator shipped with 7.0, 7.1RC1 and
7.1RC2. Additionally I replaced it with the version from
ftp://ftp.cs.huji.ac.il/users/danny/freebsd/iscsi-2.1.tar.gz with each
release. The effect is the same with all combinations:

7.0 native / 7.0 + iscsi-2.1
7.1RC1 native / 7.1RC1 + iscsi-2.1
7.1RC2 native / 7.1RC2 + iscsi-2.1


Additional info + tests:
--------------------------
1) I also checked UFS in dangerously dedicated mode - same effect.

2) After the reboot, the iSCSI device is back on /dev/da0 as expected.

3) The SAN system is a hardware SAN (iStor / GigaStor) that works
without any problems under Ubuntu, CentOS, Mac OS X, Windows and RedHat
(all tested over the last few days).

4) I could get UFS to work with the following (terribly wrong) partition
map:

         0         40         39     -   12  unused  0
        40     409600     409639 da0s1  165 FreeBSD  0
    409640 1928708016 1929117655 da0s2  165 FreeBSD  0
1929117656     262184 1929379839     -   12  unused  0

With this partition map, newfs runs without any failure. The filesystem
is heavily damaged and can be repaired with fsck. After repairing, the
fs can be mounted and used as normal.
But when running an fsck, thousands of errors must be corrected and the
result is an empty disk again (when answering everything with YES; I ran
it with the -y flag, since there are too many questions to answer
manually, even when copying only three big files).

Greetings,
Erik

-----
ArgonSoft GmbH | Im Ermlisgrund 3 | 76337 Waldbronn
Tel: +49 7243 71520 | Fax: +49 7243 715222 | http://www.argonsoft.de
Umsatzsteuer-Identnummer: DE205762306 | Handelsregister: HRB2372E
Geschäftsführer: Erik Scholtz

Danny Braniss wrote:
>
> hi,
> Which iSCSI initiator are you using?
>
> danny
>

From danny at cs.huji.ac.il Sun Dec 28 03:21:31 2008
From: danny at cs.huji.ac.il (Danny Braniss)
Date: Sun Dec 28 03:21:40 2008
Subject: Problem with disklabel and filesystem over iSCSI
In-Reply-To: <49575045.4090601@argonsoft.de>
References: <4956B01B.3000509@argonsoft.de> <49575045.4090601@argonsoft.de>
Message-ID:

> Hi,
>
> I tested the default iSCSI_initiator shipped with 7.0, 7.1RC1 and
> 7.1RC2. Additionally I replaced it with the version from
> ftp://ftp.cs.huji.ac.il/users/danny/freebsd/iscsi-2.1.tar.gz with each
> release. The effect is the same with all combinations:
>
> 7.0 native / 7.0 + iscsi-2.1
> 7.1RC1 native / 7.1RC1 + iscsi-2.1
> 7.1RC2 native / 7.1RC2 + iscsi-2.1
>
>
> Additional info + tests:
> --------------------------
> 1) I also checked UFS in dangerously dedicated mode - same effect.
>
> 2) After the reboot, the iSCSI device is back on /dev/da0 as expected.
>
> 3) The SAN system is a hardware SAN (iStor / GigaStor) that works
> without any problems under Ubuntu, CentOS, Mac OS X, Windows and RedHat
> (all tested over the last few days).
>
> 4) I could get UFS to work with the following (terribly wrong) partition
> map:
>
>          0         40         39     -   12  unused  0
>         40     409600     409639 da0s1  165 FreeBSD  0
>     409640 1928708016 1929117655 da0s2  165 FreeBSD  0
> 1929117656     262184 1929379839     -   12  unused  0
>
> With this partition map, newfs runs without any failure. The filesystem
> is heavily damaged and can be repaired with fsck. After repairing, the
> fs can be mounted and used as normal.
> But when running an fsck, thousands of errors must be corrected and the
> result is an empty disk again (when answering everything with YES; I ran
> it with the -y flag, since there are too many questions to answer
> manually, even when copying only three big files).

ok, so the problem is on shutdown/reboot. Buffers don't seem to be
flushed.
To check if this is correct, try shutdown, then under single user
unmount the iscsi, sync, sync, reboot

danny

From e.scholtz at argonsoft.de Sun Dec 28 04:29:23 2008
From: e.scholtz at argonsoft.de (Erik Scholtz, ArgonSoft GmbH)
Date: Sun Dec 28 04:29:31 2008
Subject: Problem with disklabel and filesystem over iSCSI
In-Reply-To:
References: <4956B01B.3000509@argonsoft.de> <49575045.4090601@argonsoft.de>
Message-ID: <49577118.1080700@argonsoft.de>

ok, I started the system in single-user mode and brought up the iSCSI
connection. I tried to install UFS first; after each step I synced.
No change - the problem remains the same (newfs is failing).

Then I installed ZFS again, also synced after each step, then
unmounted the iSCSI device, synced again (always at least three times).
Then I rebooted the system (going to single-user mode again) and ZFS
is corrupted again with the same message.

I don't think the buffers are the problem.

Erik

-----
ArgonSoft GmbH | Im Ermlisgrund 3 | 76337 Waldbronn
Tel: +49 7243 71520 | Fax: +49 7243 715222 | http://www.argonsoft.de
Umsatzsteuer-Identnummer: DE205762306 | Handelsregister: HRB2372E
Geschäftsführer: Erik Scholtz

Danny Braniss wrote:
>
> ok, so the problem is on shutdown/reboot. Buffers don't seem to be
> flushed.
> To check if this is correct, try shutdown, then under single user
> unmount the iscsi, sync, sync, reboot
>
> danny
>

From danny at cs.huji.ac.il Sun Dec 28 05:17:17 2008
From: danny at cs.huji.ac.il (Danny Braniss)
Date: Sun Dec 28 05:17:24 2008
Subject: Problem with disklabel and filesystem over iSCSI
In-Reply-To: <49577118.1080700@argonsoft.de>
References: <4956B01B.3000509@argonsoft.de> <49575045.4090601@argonsoft.de> <49577118.1080700@argonsoft.de>
Message-ID:

> ok, I started the system in single-user mode and brought up the iSCSI
> connection. I tried to install UFS first; after each step I synced.
> No change - the problem remains the same (newfs is failing).
>
> Then I installed ZFS again, also synced after each step, then
> unmounted the iSCSI device, synced again (always at least three times).
> Then I rebooted the system (going to single-user mode again) and ZFS
> is corrupted again with the same message.
>
in the case of zfs, you have to
	/etc/rc.d/zfs stop

> I don't think the buffers are the problem.
>
are you sure that the iscsi target is only used by the freebsd host?

From e.scholtz at argonsoft.de Sun Dec 28 05:42:39 2008
From: e.scholtz at argonsoft.de (Erik Scholtz, ArgonSoft GmbH)
Date: Sun Dec 28 05:42:46 2008
Subject: Problem with disklabel and filesystem over iSCSI
In-Reply-To:
References: <4956B01B.3000509@argonsoft.de> <49575045.4090601@argonsoft.de> <49577118.1080700@argonsoft.de>
Message-ID: <4957824C.1000601@argonsoft.de>

Danny Braniss wrote:
>> ok, I started the system in single-user mode and brought up the iSCSI
>> connection. I tried to install UFS first; after each step I synced.
>> No change - the problem remains the same (newfs is failing).
>>
>> Then I installed ZFS again, also synced after each step, then
>> unmounted the iSCSI device, synced again (always at least three times).
>> Then I rebooted the system (going to single-user mode again) and ZFS
>> is corrupted again with the same message.
>>
> in the case of zfs, you have to
> 	/etc/rc.d/zfs stop
>

I used the "zfs umount tank" command. I have now tried it with
"/etc/rc.d/zfs stop" (both cases: before and after unmounting the volume
via "zfs umount") and synced again. The effect remains the same: after
a reboot the filesystem is corrupted and unreadable.

>> I don't think the buffers are the problem.
>>
> are you sure that the iscsi target is only used by the freebsd host?

100% sure. The SAN is (at the moment) connected via crossover cable
directly to the FreeBSD box, so it has a dedicated connection. Before
that it was connected over a separate Cisco 520G switch, used only by
the SAN and the FreeBSD box.
The network card is an Intel 1 Gbit.
I also tried it with an nVidia nForce Pro 3400 (also 1 Gbit). I also
played with the MTU sizes: at the moment it is the default (1500), but
other sizes (jumbo frames, 9000) do not change anything either (I made
sure both sides, SAN and FreeBSD, use the same size).

Erik

-----
ArgonSoft GmbH | Im Ermlisgrund 3 | 76337 Waldbronn
Tel: +49 7243 71520 | Fax: +49 7243 715222 | http://www.argonsoft.de
Umsatzsteuer-Identnummer: DE205762306 | Handelsregister: HRB2372E
Geschäftsführer: Erik Scholtz

From danny at cs.huji.ac.il Sun Dec 28 06:14:28 2008
From: danny at cs.huji.ac.il (Danny Braniss)
Date: Sun Dec 28 06:14:35 2008
Subject: Problem with disklabel and filesystem over iSCSI
In-Reply-To: <4957824C.1000601@argonsoft.de>
References: <4956B01B.3000509@argonsoft.de> <49575045.4090601@argonsoft.de> <49577118.1080700@argonsoft.de> <4957824C.1000601@argonsoft.de>
Message-ID:

Since you seem to be in a cooperative mood :-), can you try checking
with a smaller volume? say 200GB? either UFS or ZFS?
the biggest I have access to is 921595 Meg and it does not show your problems.
also, what is the Media Sector size?
fdisk da0 should show it.

cheers,
danny

From e.scholtz at argonsoft.de Sun Dec 28 06:41:26 2008
From: e.scholtz at argonsoft.de (Erik Scholtz, ArgonSoft GmbH)
Date: Sun Dec 28 06:41:41 2008
Subject: Problem with disklabel and filesystem over iSCSI
In-Reply-To:
References: <4956B01B.3000509@argonsoft.de> <49575045.4090601@argonsoft.de> <49577118.1080700@argonsoft.de> <4957824C.1000601@argonsoft.de>
Message-ID: <4957900D.80601@argonsoft.de>

Well, I'm trying to find a solution to this problem - I have already
spent 6 whole days testing, downloading and reading "FM"s, "HowTo"s and
mailing lists. I don't want to waste the time of others, so giving as
much information as possible and doing the testing is the least I can do.

I reduced the SAN volume from 1TB to 200GB:

------------------------------------------------------------
web3# fdisk da0
******* Working on device /dev/da0 *******
parameters extracted from in-core disklabel are:
cylinders=26108 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=26108 heads=255 sectors/track=63 (16065 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
------------------------------------------------------------

Partition:

        0        63        62     -  12  unused    0
       63 419424957 419425019 da0s1   8 freebsd  165
419425020      5380 419430399     -  12  unused    0

1) Result of newfs in compatible mode:
419256288
internal error: can't find block in cyl 0

2) Result of newfs in dangerously dedicated mode:
..., ..., ...
419256288
cg 0: bad magic number

ZFS behaves the same (all cases: running "/etc/rc.d/zfs stop" before and
after unmounting the volume via "zfs umount", with a sync before and
after the zfs stop). When restarting ZFS without rebooting, the FS is
still OK. After a reboot the FS is broken again.

For comparison, the fdisk output from before the SAN volume was reduced:

> web3# fdisk /dev/da0s1
> ******* Working on device /dev/da0s1 *******
> parameters extracted from in-core disklabel are:
> cylinders=99693 heads=255 sectors/track=63 (16065 blks/cyl)
>
> Figures below won't work with BIOS for partitions not in cyl 1
> parameters to be used for BIOS calculations are:
> cylinders=99693 heads=255 sectors/track=63 (16065 blks/cyl)

Erik

-----
ArgonSoft GmbH | Im Ermlisgrund 3 | 76337 Waldbronn
Tel: +49 7243 71520 | Fax: +49 7243 715222 | http://www.argonsoft.de
Umsatzsteuer-Identnummer: DE205762306 | Handelsregister: HRB2372E
Geschäftsführer: Erik Scholtz

Danny Braniss wrote:
> Since you seem to be in a cooperative mood :-), can you try checking
> with a smaller volume? say 200GB? either UFS or ZFS?
> the biggest I have access to is 921595 Meg and it does not show your problems.
> also, what is the Media Sector size?
> fdisk da0 should show it.
>
> cheers,
> danny
>

From danny at cs.huji.ac.il Sun Dec 28 23:54:48 2008
From: danny at cs.huji.ac.il (Danny Braniss)
Date: Sun Dec 28 23:55:02 2008
Subject: CAM and scsi-5
Message-ID:

Hi,
are there any issues with CAM and scsi-5?
I'm asking because when the iSCSI target is an SCSI-5 unit,
all seems ok, but:
	sade  -- ok
	newfs -- ok
	fsck  -- fails!
no scsi/cam errors, but ...
thanks,
danny

From bugmaster at FreeBSD.org Mon Dec 29 03:07:02 2008
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Dec 29 03:09:02 2008
Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org
Message-ID: <200812291107.mBTB718s024570@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker Resp.
Description
--------------------------------------------------------------------------------
o kern/129602  scsi  [ahd] ahd(4) gets confused and wedges SCSI bus
o kern/128452  scsi  [sa] [panic] Accessing SCSI tape drive randomly crashe
o kern/128245  scsi  [scsi] "inquiry data fails comparison at DV1 step" [re
o kern/127927  scsi  [isp] isp(4) target driver crashes kernel when set up
o kern/126866  scsi  [isp] [panic] kernel panic on card initialization
o kern/124667  scsi  [amd] [panic] FreeBSD-7 kernel page faults at amd-scsi
o kern/123674  scsi  [ahc] ahc driver dumping
o kern/123666  scsi  [aac] attach fails with Adaptec SAS RAID 3805 controll
o sparc/121676 scsi  [iscsi] iscontrol do not connect iscsi-target on sparc
o kern/120487  scsi  [sg] scsi_sg incompatible with scanners
o kern/120247  scsi  [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s
o kern/119668  scsi  [cam] [patch] certain errors are too verbose comparing
o kern/114597  scsi  [sym] System hangs at SCSI bus reset with dual HBAs
o kern/110847  scsi  [ahd] Tyan U320 onboard problem with more than 3 disks
o kern/99954   scsi  [ahc] reading from DVD failes on 6.x [regression]
o kern/94838   scsi  Kernel panic while mounting SD card with lock switch o
o kern/92798   scsi  [ahc] SCSI problem with timeouts
o kern/90282   scsi  [sym] SCSI bus resets cause loss of ch device
o kern/76178   scsi  [ahd] Problem with ahd and large SCSI Raid system
o kern/74627   scsi  [ahc] [hang] Adaptec 2940U2W Can't boot 5.3
s kern/61165   scsi  [panic] kernel page fault after calling cam_send_ccb
o kern/60641   scsi  [sym] Sporadic SCSI bus resets with 53C810 under load
o kern/60598   scsi  wire down of scsi devices conflicts with config
s kern/57398   scsi  [mly] Current fails to install on mly(4) based RAID di
o kern/52638   scsi  [panic] SCSI U320 on SMP server won't run faster than
o kern/44587   scsi  dev/dpt/dpt.h is missing defines required for DPT_HAND
o kern/40895   scsi  wierd kernel / device driver bug
o kern/39388   scsi  ncr/sym drivers fail with 53c810 and more than 256MB m
o kern/38828   scsi  [dpt] [request] DPT PM2012B/90 doesn't work
o kern/35234   scsi  World access to /dev/pass? (for scanner) requires acce

30 problems total.

From gjb at semihalf.com Tue Dec 30 10:00:05 2008
From: gjb at semihalf.com (Grzegorz Bernacki)
Date: Tue Dec 30 10:00:12 2008
Subject: Better handling of TEST UNIT READY command
In-Reply-To: <494B7C34.7050108@semihalf.com>
References: <494B7C34.7050108@semihalf.com>
Message-ID: <4959ED07.1010101@semihalf.com>

Hi Scott,

> Sorry I missed the original posting on the scsi list. I'm curious what
> ASC/ASCQ codes are being returned by the device. This patch is good
> enough for now, please go ahead and commit it, but I'd like to see what
> the actual errors are so we can see if there's a more elegant way to
> handle it.

These are the values returned in the sense data:
Error Code 0x70
Sense Key  0x06
ASC        0x28
ASCQ       0x00

Regards,
Grzesiek
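
The four values above can be read straight out of a fixed-format sense buffer. The following is a minimal sketch of that decoding, assuming the standard fixed-format layout: response code in byte 0, sense key in the low nibble of byte 2, ASC and ASCQ in bytes 12 and 13. The function name, the sample buffer and the two lookup tables are hypothetical and hold only a small excerpt of the standard code assignments; this is not the CAM code the patch touches.

```python
# Small excerpt of the standard sense-key and ASC/ASCQ names; anything not
# listed here is reported as unknown rather than guessed.
SENSE_KEYS = {
    0x02: "NOT READY",
    0x05: "ILLEGAL REQUEST",
    0x06: "UNIT ATTENTION",
}

ADDITIONAL_SENSE = {
    (0x04, 0x01): "LOGICAL UNIT IS IN PROCESS OF BECOMING READY",
    (0x28, 0x00): "NOT READY TO READY CHANGE, MEDIUM MAY HAVE CHANGED",
    (0x29, 0x00): "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED",
}


def decode_fixed_sense(buf):
    """Decode the key fields of fixed-format (0x70/0x71) sense data."""
    if len(buf) < 14:
        raise ValueError("fixed-format sense data needs at least 14 bytes")
    response_code = buf[0] & 0x7F          # bit 7 is the VALID flag
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    sense_key = buf[2] & 0x0F
    asc, ascq = buf[12], buf[13]
    return {
        "response_code": response_code,
        "deferred": response_code == 0x71,
        "sense_key": sense_key,
        "sense_key_name": SENSE_KEYS.get(sense_key, "unknown to this sketch"),
        "asc": asc,
        "ascq": ascq,
        "meaning": ADDITIONAL_SENSE.get((asc, ascq), "unknown to this sketch"),
    }


def sample_sense():
    """Build an 18-byte buffer carrying the values reported in the message."""
    buf = bytearray(18)
    buf[0] = 0x70   # Error Code: current error, fixed format
    buf[2] = 0x06   # Sense Key
    buf[7] = 10     # additional sense length (18 - 8)
    buf[12] = 0x28  # ASC
    buf[13] = 0x00  # ASCQ
    return bytes(buf)
```

Calling `decode_fixed_sense(sample_sense())` reports sense key UNIT ATTENTION with ASC/ASCQ 0x28/0x00, "NOT READY TO READY CHANGE, MEDIUM MAY HAVE CHANGED".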