From borjam at sarenet.es Mon Sep 1 15:31:31 2014
From: borjam at sarenet.es (Borja Marcos)
Date: Mon, 1 Sep 2014 17:21:18 +0200
Subject: Samsung 840 Pro SSD and quirks
Message-ID:

Hi,

I have just noticed that the Samsung 840 SSDs now have the 4 KB block quirk added.

Is this really the case? I was playing with them some time ago and I didn't notice performance differences between using ZFS on them either "directly" (advertised 512 byte blocks) or forcing 4 KB blocks using gnop.

Just surprised, I didn't find references to the true block size.

Borja.

From killing at multiplay.co.uk Mon Sep 1 15:44:12 2014
From: killing at multiplay.co.uk (Steven Hartland)
Date: Mon, 1 Sep 2014 16:44:00 +0100
Subject: Samsung 840 Pro SSD and quirks
References:
Message-ID:

We saw a noticeable performance increase on 4k on our 8TB 840
array but I couldn't find any concrete information either.

If anyone has this info and can confirm either way that would
be great.

Regards
Steve

----- Original Message -----
From: "Borja Marcos"

>
> Hi,
>
> I have just noticed that the Samsung 840 SSDs now have the 4 KB block
> quirk added.
>
> Is this really the case? I was playing with them some time ago and
> I didn't notice performance differences between using ZFS on them
> either
> "directly" (advertised 512 byte blocks) or forcing 4 KB blocks using
> gnop.
>
> Just surprised, I didn't find references to the true block size.

From borjam at sarenet.es Mon Sep 1 16:11:54 2014
From: borjam at sarenet.es (Borja Marcos)
Date: Mon, 1 Sep 2014 18:11:49 +0200
Subject: Samsung 840 Pro SSD and quirks
In-Reply-To:
References:
Message-ID:

On Sep 1, 2014, at 5:44 PM, Steven Hartland wrote:

> We saw a noticeable performance increase on 4k on our 8TB 840
> array but I couldn't find any concrete information either.
>
> If anyone has this info and can confirm either way that would
> be great.

I don't have actual numbers, just recalling that I tried and I didn't find significant differences using bonnie++ on a ZFS pool. And I recall that according to the kstats.sysctl variables, trim was indeed working.

Just in case I am repeating the tests right now. I still have the pre-quirks kernel around and I have a pool defined with the default 512 byte blocks.

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
elibm           96G   123  99 670496  97 310330  63   303  99 818483  56  6281 165
Latency             93190us   20227us     448ms   41198us     454ms   26375us
Version  1.97       ------Sequential Create------ --------Random Create--------
elibm               -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 25723  98 +++++ +++ 24559  98 12694  99 31135 100  4810  99
Latency             15192us      97us     130us   23708us     355us    1199us
1.97,1.97,elibm,1,1409588162,96G,,123,99,670496,97,310330,63,303,99,818483,56,6281,165,16,,,,,25723,98,+++++,+++,24559,98,12694,99,31135,100,4810,99,93190us,20227us,448ms,41198us,454ms,26375us,15192us,97us,130us,23708us,355us,1199us

After a reboot, destroying and recreating the pool,

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
elibm           96G   128  99 675094  98 323692  67   303  99 862380  58  9530 189
Latency             64726us   48676us     389ms   36398us     505ms   15594us
Version  1.97       ------Sequential Create------ --------Random Create--------
elibm               -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 24857  97 +++++ +++ 20422  98 21836  98 +++++ +++ 17786  97
Latency             15422us     102us     785us   24590us     125us     170us
1.97,1.97,elibm,1,1409588443,96G,,128,99,675094,98,323692,67,303,99,862380,58,9530,189,16,,,,,24857,97,+++++,+++,20422,98,21836,98,+++++,+++,17786,97,64726us,48676us,389ms,36398us,505ms,15594us,15422us,102us,785us,24590us,125us,170us

The results seem to be more or less similar. I have checked kstats.zfs and in both cases trim was working. The count of unsupported trims was 0 while success and bytes grew as they should.

What am I missing? Note that I am not against preemptive 4K quirk strikes :)

I am comparing with multiple concurrent bonnies just in case. Or, what did you use to do the test?

Thanks!

Borja.

From borjam at sarenet.es Tue Sep 2 14:57:00 2014
From: borjam at sarenet.es (Borja Marcos)
Date: Tue, 2 Sep 2014 16:48:26 +0200
Subject: Samsung 840 Pro SSD and quirks
In-Reply-To:
References:
Message-ID: <93D764A8-01AE-42FA-8020-65CEB6C7D64C@sarenet.es>

On Sep 1, 2014, at 5:44 PM, Steven Hartland wrote:

> We saw a noticeable performance increase on 4k on our 8TB 840
> array but I couldn't find any concrete information either.
>
> If anyone has this info and can confirm either way that would
> be great.

I stand corrected. I have done some benchmarks with just two Samsung SSDs (zpool with two disks, no mirroring) and indeed I get better performance with 4 KB blocks.

I did my original tests with 12 disks and some other bottleneck was hiding the performance difference.

In both cases, anyway, Trim was working unless the system lies.

Thanks!

From killing at multiplay.co.uk Tue Sep 2 15:07:48 2014
From: killing at multiplay.co.uk (Steven Hartland)
Date: Tue, 2 Sep 2014 16:07:45 +0100
Subject: Samsung 840 Pro SSD and quirks
References: <93D764A8-01AE-42FA-8020-65CEB6C7D64C@sarenet.es>
Message-ID: <14D38CFE3887426D9065E26F50457F30@multiplay.co.uk>

Thanks for the confirmation, Borja. I was a little confused why
our two results differed.

For a 12 disk system you'll likely need two SAS2 controllers,
or at least 12 SAS lanes, otherwise you will hit controller
throughput issues, as an 840 can pretty much saturate a single
SAS2 lane on its own.

At that point you'll also start to see other issues.

I'd strongly suggest moving to stable/10, if you haven't already,
particularly if you have a large amount of RAM in the system,
otherwise you will become CPU bound on ARC hash lookups.

Regards
Steve

----- Original Message -----
From: "Borja Marcos"
To: "Steven Hartland"
Cc: "FreeBSD-scsi"
Sent: Tuesday, September 02, 2014 3:48 PM
Subject: Re: Samsung 840 Pro SSD and quirks

On Sep 1, 2014, at 5:44 PM, Steven Hartland wrote:

> We saw a noticeable performance increase on 4k on our 8TB 840
> array but I couldn't find any concrete information either.
>
> If anyone has this info and can confirm either way that would
> be great.

I stand corrected. I have done some benchmarks with just two Samsung SSDs (zpool with two disks, no mirroring) and indeed I get better performance with 4 KB blocks.

I did my original tests with 12 disks and some other bottleneck was hiding the performance difference.

In both cases, anyway, Trim was working unless the system lies.

Thanks!

From borjam at sarenet.es Tue Sep 2 15:17:32 2014
From: borjam at sarenet.es (Borja Marcos)
Date: Tue, 2 Sep 2014 17:17:28 +0200
Subject: Samsung 840 Pro SSD and quirks
In-Reply-To: <14D38CFE3887426D9065E26F50457F30@multiplay.co.uk>
References: <93D764A8-01AE-42FA-8020-65CEB6C7D64C@sarenet.es> <14D38CFE3887426D9065E26F50457F30@multiplay.co.uk>
Message-ID: <59E9F0BF-1B42-40E9-BF1E-E7AFB60C3B27@sarenet.es>

On Sep 2, 2014, at 5:07 PM, Steven Hartland wrote:

> Thanks for the confirmation, Borja. I was a little confused why
> our two results differed.

What do you use for your benchmarks? I am still playing with this, so I can run the same tests just in case.

I have done something pretty straightforward, just creating a pool, a dataset, and running bonnie++ on it.
I also have a backplane

>
> For a 12 disk system you'll likely need two SAS2 controllers,
> or at least 12 SAS lanes, otherwise you will hit controller
> throughput issues, as an 840 can pretty much saturate a single
> SAS2 lane on its own.
>
> At that point you'll also start to see other issues.
>
> I'd strongly suggest moving to stable/10, if you haven't already,
> particularly if you have a large amount of RAM in the system,
> otherwise you will become CPU bound on ARC hash lookups.

Yes, I'm following -STABLE but this braindead machine has just *one* PCIe slot, so I am limited to one controller. In my case, a SAS2008 (mps driver) with a SAS expander.

mps0: port 0x3f00-0x3fff mem 0x90ebc000-0x90ebffff,0x912c0000-0x912fffff irq 32 at device 0.0 on pci17
mps0: Firmware: 18.00.00.00, Driver: 19.00.00.00-fbsd
mps0: IOCCapabilities: 1285c

Anyway, my main concern is not the maximum throughput; the system will be much faster than the same one using "classic" hard disks :)

Borja.

From bugzilla-noreply at freebsd.org Tue Sep 9 16:21:03 2014
From: bugzilla-noreply at freebsd.org (bugzilla-noreply at freebsd.org)
Date: Tue, 09 Sep 2014 16:21:03 +0000
Subject: [Bug 191717] [isci] smartctl -H gives "ATA output registers missing" for a disk using the isci driver
In-Reply-To:
References:
Message-ID:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191717

Edward Tomasz Napierala changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |trasz at FreeBSD.org
            Summary|[iscsi] smartctl -H gives   |[isci] smartctl -H gives
                   |"ATA output registers       |"ATA output registers
                   |missing" for a disk using   |missing" for a disk using
                   |the isci driver             |the isci driver

--- Comment #2 from Edward Tomasz Napierala ---
Unrelated to iSCSI.

--
You are receiving this mail because:
You are the assignee for the bug.

From chuck at tuffli.net Wed Sep 10 19:43:36 2014
From: chuck at tuffli.net (Chuck Tuffli)
Date: Wed, 10 Sep 2014 12:43:26 -0700
Subject: equivalent of Linux dev loss timeout?
Message-ID:

Linux SCSI initiator drivers have the ability to send transport events (e.g. link up/down) to a module that handles reporting and removing devices from the system. By setting a "dev loss timeout", users can specify how long a device can be missing before it is removed from the system.

As one example of how this is used, enthusiastic testers can run IO to a device, yank out the cable between the initiator and the device for less than dev loss timeout seconds, and then watch as the IOs resume after the cable is reinserted.

Does anything like this exist in CAM or would it be up to each SIM to implement this behavior?

--chuck

From trasz at FreeBSD.org Thu Sep 11 22:07:31 2014
From: trasz at FreeBSD.org (Edward Tomasz Napierała)
Date: Fri, 12 Sep 2014 00:07:24 +0200
Subject: equivalent of Linux dev loss timeout?
In-Reply-To:
References:
Message-ID: <20140911220724.GA3981@pc5.home>

On 0910T1243, Chuck Tuffli wrote:
> Linux SCSI initiator drivers have the ability to send transport events
> (e.g. link up/down) to a module that handles reporting and removing
> devices from the system. By setting a "dev loss timeout", users can
> specify how long a device can be missing before it is removed from the
> system.
>
> As one example of how this is used, enthusiastic testers can run IO to
> a device, yank out the cable between the initiator and the device for
> less than dev loss timeout seconds, and then watch as the IOs resume
> after the cable is reinserted.
>
> Does anything like this exist in CAM or would it be up to each SIM to
> implement this behavior?

Not strictly a timeout, but if you're looking for a way to "suspend" all IO operations to a disk device when it disappears, and reexecute them when it's connected back, in a transparent way, then gmountver might do the trick.

From xzpeter at gmail.com Tue Sep 16 09:37:14 2014
From: xzpeter at gmail.com (Xu Zhe)
Date: Tue, 16 Sep 2014 17:36:48 +0800
Subject: How to disable hard disk write cache?
Message-ID: <541804B0.7070407@gmail.com>

Hi, all,

Does anyone know how to disable the write cache of a hard disk?

I have found some hints on the FreeBSD website:

https://www.freebsd.org/doc/handbook/configtuning-disk.html

However, this seems to work only for ATA devices. What about SAS/NL-SAS devices? (Also, there seems to be no such sysctl in the latest FreeBSD release, 10.0.)

Any hints are welcome!

Thanks in advance.

Peter

From bram.vandoren at ster.kuleuven.be Tue Sep 16 12:46:31 2014
From: bram.vandoren at ster.kuleuven.be (Bram Vandoren)
Date: Tue, 16 Sep 2014 14:46:18 +0200
Subject: How to disable hard disk write cache?
In-Reply-To: <541804B0.7070407@gmail.com>
References: <541804B0.7070407@gmail.com>
Message-ID: <5418311A.3080007@ster.kuleuven.be>

On 09/16/2014 11:36 AM, Xu Zhe wrote:

> Does anyone know how to disable the write cache of a hard disk?

Have a look at camcontrol modepage and the WCE bit.

Cheers,
Bram.

From dgilbert at interlog.com Tue Sep 16 13:50:35 2014
From: dgilbert at interlog.com (Douglas Gilbert)
Date: Tue, 16 Sep 2014 09:50:14 -0400
Subject: How to disable hard disk write cache?
In-Reply-To: <5418311A.3080007@ster.kuleuven.be>
References: <541804B0.7070407@gmail.com> <5418311A.3080007@ster.kuleuven.be>
Message-ID: <54184016.7060208@interlog.com>

On 14-09-16 08:46 AM, Bram Vandoren wrote:
> On 09/16/2014 11:36 AM, Xu Zhe wrote:
>
>> Does anyone know how to disable the write cache of a hard disk?
>
> Have a look at camcontrol modepage and the WCE bit.

You could also use the sdparm utility:
    sdparm -c WCE

From delphij at delphij.net Tue Sep 16 14:09:10 2014
From: delphij at delphij.net (Xin Li)
Date: Tue, 16 Sep 2014 22:09:08 +0800
Subject: How to disable hard disk write cache?
In-Reply-To: <541804B0.7070407@gmail.com>
References: <541804B0.7070407@gmail.com>
Message-ID: <54184484.1070304@delphij.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 9/16/14 5:36 PM, Xu Zhe wrote:
> Hi, all,
>
> Does anyone know how to disable the write cache of a hard disk?
>
> I have found some hints on the FreeBSD website:
>
> https://www.freebsd.org/doc/handbook/configtuning-disk.html
>
> However, this seems to work only for ATA devices. What about SAS/NL-SAS
> devices? (Also, there seems to be no such sysctl in the latest
> FreeBSD release, 10.0.)
>
> Any hints are welcome!

Why do you want to disable write cache in the first place? It's not needed for most configurations nowadays.

Modern SATA/SAS/SCSI devices usually come with the capability of tagged commands, allowing the OS to know when a write buffer is on stable storage. With this, file systems can easily implement the right semantics and recover from e.g. a power outage, etc.

Cheers,
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJUGESDAAoJEJW2GBstM+ns2R4QAKKL2+TKfTaFQOWk4/jOKzdD
6yANm2ZYI3reChuUmbYjjchfce6nGcxz7EPjTOfBLj37p8bZFaW91e/ayviY9pNL
QktP0hgTZp4EFLJSlPfqjx6f8aU2gJ640b/cKbIQkaxHWRLoHea8GJ2XhyVS9JfK
o8aM+VPyxRrTRH0e2BQ8t0DMJohCrVNZ0fYSAKToDqI2RpcmOumYA4yChXb0hgvc
Rz3PGZth19E4rxdNtOM0Ap/l4PL3+lInIlU8kPdwUaqvT11mxjUM+6zkF904VtqL
5xoURW/j9MAbFl6ozJJKnkfd20lCT3TfyQUC7weDOT8Jz78/8Sx2qy6ilkQCe0ZI
AWV0hpoQPk2bzFxwB7wr9MiVbXJcpAHfHFcKTbLWY4NzY/8RknDt5dNsqsntlDww
dSb1xRLdPQFBh7NrZA1GAfxXdfK8GNBONF+cRKutvFAlIzq0kywaa5baN1ZwLdU2
Kz3LJ0SLQMNEFioES+9j94fSeiW9LY9OJlI9EAjf2gys4WFeLkTgOgdBhvY935Mi
j6FHk4xZmgKOS2s0c7UOYMjj6OgU1or4t2awMF/ObQwd8cmCT5abeGUdtrDGdCZu
am/9nbCfR6FHhm5k90t+yCzl/hPf7dqVoz41bnP6+dD5q5zwxOhOc/SEiCJKCH3L
iQoPiJfxH7JGZdCw/Vz+
=QSHo
-----END PGP SIGNATURE-----

From bugzilla-noreply at freebsd.org Tue Sep 16 22:00:32 2014
From: bugzilla-noreply at freebsd.org (bugzilla-noreply at freebsd.org)
Date: Tue, 16 Sep 2014 22:00:32 +0000
Subject: [Bug 193696] CAM AC_FOUND_DEVICE calls malloc(M_WAITOK) from THREAD_NO_SLEEPING() context
In-Reply-To:
References:
Message-ID:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193696

Garrett Cooper changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mav at FreeBSD.org
           Assignee|freebsd-bugs at FreeBSD.org  |freebsd-scsi at FreeBSD.org

--
You are receiving this mail because:
You are the assignee for the bug.

From xzpeter at gmail.com Wed Sep 17 03:04:49 2014
From: xzpeter at gmail.com (Xu Zhe)
Date: Wed, 17 Sep 2014 11:04:42 +0800
Subject: How to disable hard disk write cache?
In-Reply-To: <54184484.1070304@delphij.net>
References: <541804B0.7070407@gmail.com> <54184484.1070304@delphij.net>
Message-ID: <5418FA4A.1050707@gmail.com>

On 14-9-16 22:09, Xin Li wrote:
> On 9/16/14 5:36 PM, Xu Zhe wrote:
>> > Hi, all,
>> >
>> > Does anyone know how to disable the write cache of a hard disk?
>> >
>> > I have found some hints on the FreeBSD website:
>> >
>> > https://www.freebsd.org/doc/handbook/configtuning-disk.html
>> >
>> > However, this seems to work only for ATA devices. What about SAS/NL-SAS
>> > devices? (Also, there seems to be no such sysctl in the latest
>> > FreeBSD release, 10.0.)
>> >
>> > Any hints are welcome!
> Why do you want to disable write cache in the first place? It's not
> needed for most configurations nowadays.
>
> Modern SATA/SAS/SCSI devices usually come with the capability of
> tagged commands, allowing the OS to know when a write buffer is on
> stable storage. With this, file systems can easily implement the
> right semantics and recover from e.g. a power outage, etc.
>
> Cheers,

Hi, Xin,

Thanks for the reply.

My final goal is not to disable the write cache. I just want to do a simple test to see how much the IOPS drop when the write cache is disabled (I suppose that without the disk write cache, the value should be the same as the IOPS of random reads, both using the O_DIRECT flag to avoid the system cache, etc.).

Meanwhile, what I really want to know is: how can I make sure that IO is persistent once I get a reply from the hard disk? I suppose all file systems rely on this assumption to achieve self-consistency.

I have googled "tagged command" but only found material related to TCQ, which mainly talks about queueing and not about cache sync. Any more hints?

I saw some pages that mention SATA FUA command support on Linux (which I guess is what I am looking for):

https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt

However, I have not found useful information related to FreeBSD.
:(

Peter

From nitroboost at gmail.com Wed Sep 17 06:49:05 2014
From: nitroboost at gmail.com (Jason Wolfe)
Date: Tue, 16 Sep 2014 23:49:03 -0700
Subject: Samsung 840 Pro SSD and quirks
In-Reply-To: <14D38CFE3887426D9065E26F50457F30@multiplay.co.uk>
References: <93D764A8-01AE-42FA-8020-65CEB6C7D64C@sarenet.es> <14D38CFE3887426D9065E26F50457F30@multiplay.co.uk>
Message-ID:

We run the Samsung 843T drives, which are basically 840s with extra flash for overprovisioning. We have it directly from Samsung that they are in fact 4k drives, which led to the quirk addition for the 843T in r270305. Anyway, just wanted to round the thread out with some confirmation.

Jason

On Tue, Sep 2, 2014 at 8:07 AM, Steven Hartland wrote:

> Thanks for the confirmation, Borja. I was a little confused why
> our two results differed.
>
> For a 12 disk system you'll likely need two SAS2 controllers,
> or at least 12 SAS lanes, otherwise you will hit controller
> throughput issues, as an 840 can pretty much saturate a single
> SAS2 lane on its own.
>
> At that point you'll also start to see other issues.
>
> I'd strongly suggest moving to stable/10, if you haven't already,
> particularly if you have a large amount of RAM in the system,
> otherwise you will become CPU bound on ARC hash lookups.
>
> Regards
> Steve
>
> ----- Original Message ----- From: "Borja Marcos"
> To: "Steven Hartland"
> Cc: "FreeBSD-scsi"
> Sent: Tuesday, September 02, 2014 3:48 PM
> Subject: Re: Samsung 840 Pro SSD and quirks
>
>
>
>
> On Sep 1, 2014, at 5:44 PM, Steven Hartland wrote:
>
> We saw a noticeable performance increase on 4k on our 8TB 840
>> array but I couldn't find any concrete information either.
>>
>> If anyone has this info and can confirm either way that would
>> be great.
>>
>
> I stand corrected. I have done some benchmarks with just two Samsung SSDs
> (zpool with two disks, no mirroring) and indeed I get better performance
> with 4 KB blocks.
>
> I did my original tests with 12 disks and some other bottleneck was hiding
> the performance difference.
>
> In both cases, anyway, Trim was working unless the system lies.
>
>
>
>
>
> Thanks!
>
>
> _______________________________________________
> freebsd-scsi at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe at freebsd.org"
>

From borjam at sarenet.es Wed Sep 17 06:52:05 2014
From: borjam at sarenet.es (Borja Marcos)
Date: Wed, 17 Sep 2014 08:51:51 +0200
Subject: Samsung 840 Pro SSD and quirks
In-Reply-To:
References: <93D764A8-01AE-42FA-8020-65CEB6C7D64C@sarenet.es> <14D38CFE3887426D9065E26F50457F30@multiplay.co.uk>
Message-ID:

On Sep 17, 2014, at 8:49 AM, Jason Wolfe wrote:

> We run the Samsung 843T drives, which are basically 840s with extra flash for over provisioning. We have directly from Samsung that they are in fact 4k drives, which led to the quirk addition for the 843T in r270305. Anyway, just wanted to round the thread out with some confirmation.

Thank you. Some official word always helps!

Borja.

From bugzilla-noreply at freebsd.org Fri Sep 19 17:05:56 2014
From: bugzilla-noreply at freebsd.org (bugzilla-noreply at freebsd.org)
Date: Fri, 19 Sep 2014 17:05:56 +0000
Subject: [Bug 193696] CAM AC_FOUND_DEVICE calls malloc(M_WAITOK) from THREAD_NO_SLEEPING() context
In-Reply-To:
References:
Message-ID:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193696

--- Comment #4 from Bryan Drewery ---
Kib had some feedback on the assert:

1. We should also add it (and the interrupt check) to uma_zalloc_arg() (through 1 inline function)
2. The interrupt assert may be wrong since it is not OK to malloc(9) in an interrupt, regardless of the flags.

Isilon's internal discussion was that we should add a debug stack output rather than an assert until all major cases are fixed.

--
You are receiving this mail because:
You are the assignee for the bug.

From matthias.andree at gmx.de Sun Sep 21 08:35:05 2014
From: matthias.andree at gmx.de (Matthias Andree)
Date: Sun, 21 Sep 2014 10:34:54 +0200
Subject: How to disable hard disk write cache?
In-Reply-To: <54184484.1070304@delphij.net>
References: <541804B0.7070407@gmail.com> <54184484.1070304@delphij.net>
Message-ID: <541E8DAE.8010407@gmx.de>

On 16.09.2014 at 16:09, Xin Li wrote:

> Modern SATA/SAS/SCSI devices usually come with the capability of
> tagged commands, allowing the OS to know when a write buffer is on
> stable storage. With this, file systems can easily implement the
> right semantics and recover from e.g. a power outage, etc.

Yes, they *can easily implement* that, but which file systems in FreeBSD *actually do* that?

Do we have a list of which file systems are safe to use with WCE=1 as long as they do NCQ?

And what do you need to do on those Samsung drives where NCQ is flaky (HD103SI for one)?

From delphij at delphij.net Sun Sep 21 08:42:28 2014
From: delphij at delphij.net (Xin Li)
Date: Sun, 21 Sep 2014 16:42:26 +0800
Subject: How to disable hard disk write cache?
In-Reply-To: <541E8DAE.8010407@gmx.de>
References: <541804B0.7070407@gmail.com> <54184484.1070304@delphij.net> <541E8DAE.8010407@gmx.de>
Message-ID: <541E8F72.4020202@delphij.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 9/21/14 4:34 PM, Matthias Andree wrote:
> On 16.09.2014 at 16:09, Xin Li wrote:
>
>> Modern SATA/SAS/SCSI devices usually come with the capability
>> of tagged commands, allowing the OS to know when a write buffer
>> is on stable storage. With this, file systems can easily
>> implement the right semantics and recover from e.g. a power
>> outage, etc.
>
> Yes, they *can easily implement* that, but which file systems in
> FreeBSD *actually do* that?

Both UFS (with Soft Updates, which is what it's all about; I haven't checked the Journalled Soft Updates case) and ZFS have done that for ages.

Cheers,
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJUHo9xAAoJEJW2GBstM+nsMo0P/2oyyJeH0LOvGGMIDJWw4CZf
WGbEcecdQ3idVAw/2j/aYrlGpeYpGNZWi7dv5MOc3jdWRQbGkZ5r3E90YaSiwLac
CRix+ovXeM+g7KSfb4q5NNrn9SEpQe8AklS5deTiFP3JxfV4juAf/aXUoCtrpdhm
acVh0jYpel5ffdKDzXrHpKItvZ7t0YBg2qATK83+TGJ5cdefcEQnvKnOBMShyM4U
ReOoeFO1y8ZBh9aVSYJlMarghxVXar212MavvLJxvKruQE++xf2lnhf8BkvCRiSd
Xz24LZMJ/Cz92jMuYbCQ7sX4iIDkcCzpZeVOO8NjBE4YCc5XOCeIvw4uUdLwuWEz
QqLLZYbepN1cjPraSrmNH8wFA6kZIQuZvMb3JJt6zRR8621VDg2IABAA4KyKs9CB
74shN5alKNCiDNf75EXfcZL30lTx8KDf/dPAil/mtYc0s7Orwc6veRDVmshCvSH+
y+f5NMYyOuVVoSrGNIhdP21c7vRyAywUVKgBT7IbnZg8I5ORNumCNcSFt+Q8c9Az
sIC2cR2XVqCRQpH7qnfoot1e0FwUJEdyOHupVVMlTzFb55z06k79/w1/rruY4ylT
rSunaKPy5xm4e43JxgzNhXobH6Qd0qO9VxB5nNb4By52GDccS5xIQw6v36TVXorn
WfdQb0tZYKfnlE8tjCKH
=MkPK
-----END PGP SIGNATURE-----

From matthias.andree at gmx.de Sun Sep 21 08:53:02 2014
From: matthias.andree at gmx.de (Matthias Andree)
Date: Sun, 21 Sep 2014 10:52:52 +0200
Subject: How to disable hard disk write cache?
In-Reply-To: <541E8F72.4020202@delphij.net>
References: <541804B0.7070407@gmail.com> <54184484.1070304@delphij.net> <541E8DAE.8010407@gmx.de> <541E8F72.4020202@delphij.net>
Message-ID: <541E91E4.4030802@gmx.de>

On 21.09.2014 at 10:42, Xin Li wrote:
> On 9/21/14 4:34 PM, Matthias Andree wrote:
>> On 16.09.2014 at 16:09, Xin Li wrote:
>
>>> Modern SATA/SAS/SCSI devices usually come with the capability
>>> of tagged commands, allowing the OS to know when a write buffer
>>> is on stable storage. With this, file systems can easily
>>> implement the right semantics and recover from e.g. a power
>>> outage, etc.
>
>> Yes, they *can easily implement* that, but which file systems in
>> FreeBSD *actually do* that?
>
> Both UFS (with Soft Updates, which is what it's all about; I haven't
> checked the Journalled Soft Updates case) and ZFS have done that for ages.

Thank you. Is this in any of the 9.3 manpages so I could look that up rather than ask questions on the lists?
:-)

From delphij at delphij.net Sun Sep 21 09:16:01 2014
From: delphij at delphij.net (Xin Li)
Date: Sun, 21 Sep 2014 17:15:58 +0800
Subject: How to disable hard disk write cache?
In-Reply-To: <5418FA4A.1050707@gmail.com>
References: <541804B0.7070407@gmail.com> <54184484.1070304@delphij.net> <5418FA4A.1050707@gmail.com>
Message-ID: <541E974E.7060702@delphij.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 9/17/14 11:04 AM, Xu Zhe wrote:
> My final goal is not to disable the write cache. I just want to do a
> simple test to see how much the IOPS drop when the write cache is
> disabled (I suppose that without the disk write cache, the value should
> be the same as the IOPS of random reads, both using the O_DIRECT flag
> to avoid the system cache, etc.).
>
> Meanwhile, what I really want to know is: how can I make sure that
> IO is persistent once I get a reply from the hard disk? I suppose
> all file systems rely on this assumption to achieve
> self-consistency.

You should get consistent I/O results assuming you are doing the right benchmark, which unfortunately not all benchmarks do.

Doing benchmarks without the on-device caches is interesting but IMHO not very useful, because it's likely that no practical application would do that. Traditionally, these write caches are disabled only when you really care about data durability; with modern technology, you can now get affordable low-latency, battery/supercapacitor-backed devices, for example most enterprise-grade SSDs, and use one as your write-ahead journalling device to cover that gap.

In order to get more consistent results, you may choose a running data set that is large enough to defeat the cache effect. The size can be chosen by doing a sequence of tests while gradually increasing your running set size; eventually you will find an "inflection point" where performance drops significantly.
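
As a rough illustration of that search (a sketch only, not a benchmark tool): assuming you have already collected (working-set size, IOPS) pairs from repeated runs, the snippet below picks the size at which performance falls most sharply relative to the previous step. The function name and the sample numbers are hypothetical and are not measurements from this thread.

```python
def find_inflection(samples):
    """Return (size, relative_drop) for the sharpest performance drop.

    samples: list of (working_set_mib, iops) tuples, sorted by increasing
    size. The drop is measured relative to the previous sample, so a fall
    from 19800 to 450 IOPS counts for more than one from 450 to 430.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    worst_size, worst_drop = None, 0.0
    for (_, prev_iops), (size, iops) in zip(samples, samples[1:]):
        drop = (prev_iops - iops) / prev_iops
        if drop > worst_drop:
            worst_size, worst_drop = size, drop
    return worst_size, worst_drop


# Hypothetical measurements: (working set in MiB, random-write IOPS).
measurements = [(16, 21000), (32, 20500), (64, 19800), (128, 450), (256, 430)]
size, drop = find_inflection(measurements)
print(f"largest drop ({drop:.0%}) when growing to {size} MiB")
```

With data like this, a working set somewhat larger than the reported size would be a reasonable starting point for cache-defeating runs.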

Please note that typical hard drives do not offer consistent latency when you operate on different areas of the drive, and the results can vary because of a lot of factors. To do meaningful benchmarks, these factors also need to be considered.

> I have googled "tagged command" but only found material
> related to TCQ, which mainly talks about queueing and not
> about cache sync. Any more hints?

Hrm, I don't have much offhand, but these are defined by the standards. Alexander knows much more than I do in this area.

> I saw some pages that mention SATA FUA command support on
> Linux (which I guess is what I am looking for):
>
> https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt
>
> However, I have not found useful information related to FreeBSD.
> :(

Probably g_bio(9), but the documentation does not have much detail on driver implementation, and callers of the API expect the driver to do all the right things. This is where you would go if you want to implement a file system.

But back to your question, if you want to know e.g., "how to disable write cache on an AHCI connected drive", you would go to the individual driver's manual, e.g. ada(4).

Cheers,
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJUHpdOAAoJEJW2GBstM+nsRPwP/1FzatW89mmXIdGPYk9IZ+Bo
Qdlqvtxy0MCZOJQvug3jUKkiKGxr3+7boo1juQCKgMBqfJzPPhwAef3pRDteuPV4
VGTItEYAKWDo8JXnEl07ChAyxYMiCikiTkplaikOOa5gfmPdyPc9/a0myJ9WQDGk
v6D0flS6s4+ZSRolLAMvSRtzpIAnDUIrU4FjJX+D4Mygz1hrHoYigzic3AahCOIm
PhQe1nW6cPOjJASknFSFs450cDrQZ2QeVfH2Kr+K49ULdX+dn7ot3etJ4Fw+EBUN
m66yPpgjssj/LqXS3m7XLJJYOOBy5FAeBrVUKzGQ6FxkfoCG8+kkD7YsQ2CJZPkc
ILLIshVSL35OchuuMJpG9BGcr2BaaUQ23D4d72nIDm+ktsXiLkyD1XbZm7Ze+bR6
aW2KT3t8dU8NhJi8iyoWMS3bksh6lC5k07g8nqAsUNI+vAlKsnbT3cJJuowzbNLc
5Ai6L2+qFIst97BhTWxlpyBWkJM4K2lzB3AhZyc0Y4uXYni4Uj2DPSrC8dxOrp36
rH3SPACSnRP2cyChWQ+vV3l/uNsWM9snqVFAXPrSjmdN0fzN2KzfxnUiK0PtaFyR
DoZcwM590GWYianbmzHFPiVCrsqjS6y2z4GK1SVUEfa9VM2qahWUSHFNap1DC3W6
ab4I5sE2eO3BknRXpoMg
=tb5a
-----END PGP SIGNATURE-----

From delphij at delphij.net Sun Sep 21 09:19:54 2014
From: delphij at delphij.net (Xin Li)
Date: Sun, 21 Sep 2014 17:19:51 +0800
Subject: How to disable hard disk write cache?
In-Reply-To: <541E91E4.4030802@gmx.de>
References: <541804B0.7070407@gmail.com> <54184484.1070304@delphij.net> <541E8DAE.8010407@gmx.de> <541E8F72.4020202@delphij.net> <541E91E4.4030802@gmx.de>
Message-ID: <541E9837.3060407@delphij.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 9/21/14 4:52 PM, Matthias Andree wrote:
> On 21.09.2014 at 10:42, Xin Li wrote:
>> On 9/21/14 4:34 PM, Matthias Andree wrote:
>>> On 16.09.2014 at 16:09, Xin Li wrote:
>>
>>>> Modern SATA/SAS/SCSI devices usually come with the
>>>> capability of tagged commands, allowing the OS to know when a
>>>> write buffer is on stable storage. With this, file systems
>>>> can easily implement the right semantics and recover from
>>>> e.g. a power outage, etc.
>>
>>> Yes, they *can easily implement* that, but which file systems
>>> in FreeBSD *actually do* that?
>>
>> Both UFS (with Soft Updates, which is what it's all about; I
>> haven't checked the Journalled Soft Updates case) and ZFS have done
>> that for ages.
>
> Thank you.
> Is this in any of the 9.3 manpages so I could look that
> up rather than ask questions on the lists? :-)

I don't really know... I personally see it as a vital feature for any file system (otherwise they become the cat who ate our homework :) so I doubt it's actually explicitly documented. This is probably a good FAQ candidate though.

Cheers,
-----BEGIN PGP SIGNATURE-----

iQIbBAEBCgAGBQJUHpg3AAoJEJW2GBstM+nsQDoP+IFrhCT+1LEfDOQYcC29XRi3
SHwrkoMOXgnvw0/h0xTGxQ9pLDl9xLIwTmaWdFNtHM1hYUI1XzZC96GuNSEQfxuR
G6VPIsj6mb/sE0+HphZmJHsM3QG2gC9aiFtMr6DggBa4bWokFpjWIigmKP4z5eFD
9a+q4P5tZFIJUoKeDHEyxDdaoxXRWcD7528ALF7MueJXj0Ej1ErPdr4yAi63YUar
VrqGqPSRnyEf5h/wak5UGy4ETXZmIScAONYDjI7bwZaJbgO/3qn3AqiV+xVtPJol
EVW8b3LjMuV0/d+aym02eVqb4CMPx8JE2Zb8cPl9oVBI386ej9NKzVQVrbL9SC+N
Ogm4ScYQxYBI6k7fKg7L2Iy9K2zYHAIdBSRlFjVRipn8O8wt8yrTq0lZYxLpykIY
g1+lVeguKZtntF1jg0r4Bnw0+qhJczXf/Xu7icnWIcOjzcAKd+cIXjHzb/0Jh1ZN
RM7hEyR7aHW5nwwZPfpV+Q5KWCflc9XgUxxWVVgzXkWDhDMuCpTSGBi/ohjbgPs7
I0cj4IBH+S7fTLuRrkYyUaUg3Sqe3Z3A9Tkyr2q4ePq6RctigcDjVHfbFNpL10b0
PQfF2IqprcMdGoiqIZoQ82qpwSNizFKVzv7vLQnWvFd2Fr3Gh2XQbXWESDo33QfR
VMxnrmQ8xrfBxX2Bn9g=
=ZkPk
-----END PGP SIGNATURE-----

From mav at FreeBSD.org Sun Sep 21 09:53:39 2014
From: mav at FreeBSD.org (Alexander Motin)
Date: Sun, 21 Sep 2014 12:53:34 +0300
Subject: How to disable hard disk write cache?
In-Reply-To: <541E974E.7060702@delphij.net>
References: <541804B0.7070407@gmail.com> <54184484.1070304@delphij.net> <5418FA4A.1050707@gmail.com> <541E974E.7060702@delphij.net>
Message-ID: <541EA01E.6020600@ixsystems.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 21.09.2014 12:15, Xin Li wrote:
> On 9/17/14 11:04 AM, Xu Zhe wrote:
>> I have googled "tagged command" but only found material
>> related to TCQ, which mainly talks about queueing
>> and not about cache sync. Any more hints?
>
> Hrm, I don't have much offhand, but these are defined by the
> standards. Alexander knows much more than I do in this area.
> >> I saw some pages that mentioned about SATA FUA command support >> on Linux (Which I guess is what I am looking for): > >> https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt > >> However, I have not found useful information related to >> Freebsd. :( > > Probably g_bio(9) but the documentation do not have much detail on > driver implementation, and callers of the API expects the driver to > do all the right things. This is if you want to implement a file > system, where you would go. > > But back to your question, if you want to know e.g., "how to > disable write cache on an AHCI connected drive", you would go to > the individual driver's manual, e.g. ada(4). I see there was several different topics touched in this threads, so let me start from the beginning. There are several methods to control disk caches, and respectively several approaches to maintaining the data consistency: - Caches can be controlled globally SCSI disks allow to control write cache via the Caching mode page. You may do it with `camcontrol mode` command. ATA disks has specific commands for that, and easiest way to do it is using sysctls/tunables documented in ada(4) manual page. Disabling caches globally heavily affects performance, and in most cases is overkill. It means that _every_ request will go to the media before the operation complete. For disks without command queuing (like legacy ATA) that usually meant _one_ I/O request per platter revolution. Do you want disk doing 120 IOPS peak? If you write huge file in 128K chunks, you will get limited by 120/s * 128K = 15MB/s! Command queuing (NCQ for SATA) significantly improved this situation since OS can now send more operations down to the disk to give it more flexibility, in significant part compensating disabled cache. But number of simultaneously active tags in disk, HBA or application can be limited, creating delays. 
- Caches can be controlled per-command SCSI disks may support FUA and DPO flags, that allow to do above cache control on per-command basis. SATA disks got FUA flag just recently. This approach has all properties of above, just can be controller per data type or application. Those flags are equivalent of IO_SYNC and IO_DIRECT flags on VFS layer of FreeBSD. Though FreeBSD block layer never had their equivalents. I've heard that Windows NTFS used this technique to keep metadata consistency up to some point, but moved away from it. - Caches can be flushed to media with SYNCHRONIZE CACHE commands Both SATA and SCSI disks provide ways to flush write caches to the media on request. It allows disks to do many writes in any order they prefer within one transaction group, but then reliably push everything to the media before when the group being committed. FreeBSD supports that on both VFS (fflush/VOP_FSYNC) and block layers (BIO_FLUSH). ZFS on FreeBSD relies on it to keep transaction groups atomicity and so metadata correctness. I've heard, this is what Windows NTFS is doing now too. I hope that answered most of questions. 
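
The third approach (write freely, then flush once before declaring a
group committed) can be illustrated from user space. The following is
only a sketch in Python, not FreeBSD kernel code: `commit_group` and
the file name are made up for the example, and `os.fsync()` merely
wraps the fsync(2) system call. Whether that call ends up as a cache
flush on the disk depends on the file system and driver underneath.

```python
import os
import tempfile


def commit_group(path, records):
    """Append every record, then flush once before reporting success.

    Hypothetical helper for illustration: the individual writes may be
    cached and reordered below us; only the single fsync at the end
    marks the point where the whole group is expected to be durable.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        written = 0
        for record in records:
            view = memoryview(record)
            # os.write may accept fewer bytes than offered, so loop.
            while view:
                n = os.write(fd, view)
                view = view[n:]
                written += n
        # One flush for the whole group instead of one per write.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "txg.log")
        total = commit_group(log, [b"block-a\n", b"block-b\n", b"commit\n"])
        print(total, "bytes written before the flush returned")
```

The point of the design is that the cost of the flush is paid once per
group rather than once per request, which is why it avoids the
per-revolution penalty described for globally disabled caches.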
- -- 
Alexander Motin
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQF8BAEBCgBmBQJUHqAeXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRFOThDRjNDNEU2OUNDM0NEMEU1NzlENTU4
MzE4QzM5NTVCQUIyMjdGAAoJEIMYw5VbqyJ/E6UH+QF5AU3KNKjBLyNsgGTOn+UV
SZAj3V3JLUNM4Z+8Z+4rxY/26/nbfNPoEJD4SaOt74oEUA574XjVLkHmqZ5JKygV
dzOE2m8t0gyDIPurGx2CfRyG7UdJKbVsJ7Ebxafd8lRbwHGoObySUmew8t8WSIHA
zBsI3ZiRYzBX7NnejPlJ8UPh498PBl78U+Ak08q/0scdPWsCneCjqHHn0Rx2MEFp
7sllphFh+EBXJXqetFRJfuWmm7yfIrzjO5UJ1mTjq5dCSeXrsIxBMBTSMEG34Yv6
9PqPWYPdVWQKgluKcaTKgrzPVTBgAqefx4gHRm/460a+7qohloCelMsgAsttYuA=
=De3m
-----END PGP SIGNATURE-----

From borjam at sarenet.es Mon Sep 22 15:47:01 2014
From: borjam at sarenet.es (Borja Marcos)
Date: Mon, 22 Sep 2014 17:39:58 +0200
Subject: mpr vs mps performance
Message-ID:

Hello,

I have been playing with the new SAS3 cards supported by the mpr
driver, and I've found that, in the same hardware configuration, they
are considerably slower at writing data.

Moreover, running two simultaneous "bonnie" benchmarks (I am using
SSDs, and a single "bonnie" sometimes hits 100% CPU usage without
really saturating the I/O) I see the write activity stalling somewhat,
with disk bandwidth dropping from 600 MB/s to around 50 for 20 seconds
or so.

I'd like to know if this matches anyone else's experience. Also, I can
run some tests if needed. But for now it seems we will stick with the
SAS2 HBAs.

The Bonnie results are:

With the mpr driver, SAS3 (figures are per bonnie instance, so
multiply by 2 to get the actual bandwidth achieved):

Seq. output (writing)
  Block:   292155 KB/s
  Rewrite: 139713 KB/s
Seq. input (reading)
  Block:   862861 KB/s

With the mps driver, SAS2 (again, the total is 2x the following
figures):

Seq. output (writing)
  Block:   587950 KB/s
  Rewrite: 208239 KB/s
Seq. input (reading)
  Block:   842169 KB/s

The storage is a ZFS pool with a 9-disk raidz2 vdev, made of Samsung
840 EVO 1 TB SSDs.
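
As a worked calculation on the figures above: doubling the
per-instance numbers and comparing the two drivers takes a few lines
of Python. Only the six KB/s values come from this message; the
dictionary layout and the names are illustrative.

```python
# Per-instance bonnie figures quoted in the message, in KB/s.
PER_INSTANCE_KBPS = {
    "mpr (SAS3)": {"write": 292155, "rewrite": 139713, "read": 862861},
    "mps (SAS2)": {"write": 587950, "rewrite": 208239, "read": 842169},
}
INSTANCES = 2  # two bonnie processes were run simultaneously


def aggregate(per_instance, instances=INSTANCES):
    """Scale per-instance throughput to the total across all instances."""
    return {test: kbps * instances for test, kbps in per_instance.items()}


totals = {drv: aggregate(fig) for drv, fig in PER_INSTANCE_KBPS.items()}

for test in ("write", "rewrite", "read"):
    mpr = totals["mpr (SAS3)"][test]
    mps = totals["mps (SAS2)"][test]
    print(f"{test:8s} mpr {mpr:>8d} KB/s  mps {mps:>8d} KB/s  "
          f"mpr/mps = {mpr / mps:.2f}")
```

With these inputs the block write rate under mpr comes out at roughly
half of the mps figure and the rewrite rate at about two thirds, while
the read rates are essentially equal.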
The pool has been created with an ashift of 12 (zpool applied it
thanks to the 4 KB block quirk for these SSDs).

 at scbus0 target 9 lun 0 (pass0,da0)
 at scbus0 target 10 lun 0 (pass1,da1)
 at scbus0 target 11 lun 0 (pass2,da2)
 at scbus0 target 12 lun 0 (pass3,da3)
 at scbus0 target 13 lun 0 (pass4,da4)
 at scbus0 target 14 lun 0 (pass5,da5)
 at scbus0 target 17 lun 0 (pass7,da6)
 at scbus0 target 18 lun 0 (pass8,da7)
 at scbus0 target 27 lun 0 (pass14,da12)

The mpr card details follow:

Sep 17 09:49:39 elibm kernel: mpr0: port 0x3f00-0x3fff mem 0x912f0000-0x912fffff irq 32 at device 0.0 on pci17
Sep 17 09:49:39 elibm kernel: mpr0: IOCFacts :
Sep 17 09:49:39 elibm kernel: MsgVersion: 0x205
Sep 17 09:49:39 elibm kernel: HeaderVersion: 0x1d00
Sep 17 09:49:39 elibm kernel: IOCNumber: 0
Sep 17 09:49:39 elibm kernel: IOCExceptions: 0x0
Sep 17 09:49:39 elibm kernel: MaxChainDepth: 128
Sep 17 09:49:39 elibm kernel: NumberOfPorts: 1
Sep 17 09:49:39 elibm kernel: RequestCredit: 11264
Sep 17 09:49:39 elibm kernel: ProductID: 0x2221
Sep 17 09:49:39 elibm kernel: IOCRequestFrameSize: 32
Sep 17 09:49:39 elibm kernel: MaxInitiators: 1
Sep 17 09:49:39 elibm kernel: MaxTargets: 1024
Sep 17 09:49:39 elibm kernel: MaxSasExpanders: 14
Sep 17 09:49:39 elibm kernel: MaxEnclosures: 15
Sep 17 09:49:39 elibm kernel: HighPriorityCredit: 60
Sep 17 09:49:39 elibm kernel: MaxReplyDescriptorPostQueueDepth: 65504
Sep 17 09:49:39 elibm kernel: ReplyFrameSize: 32
Sep 17 09:49:39 elibm kernel: MaxVolumes: 0
Sep 17 09:49:39 elibm kernel: MaxDevHandle: 1047
Sep 17 09:49:39 elibm kernel: MaxPersistentEntries: 128
Sep 17 09:49:39 elibm kernel: mpr0: Firmware: 01.00.03.00, Driver: 05.255.05.00-fbsd
Sep 17 09:49:39 elibm kernel: mpr0: IOCCapabilities: 3a85c

And the mps card is a classic:

Sep 22 17:18:24 elibm kernel: mps0: port 0x3f00-0x3fff mem 0x90ebc000-0x90ebffff,0x912c0000-0x912fffff irq 32 at device 0.0 on pci17
Sep 22 17:18:24 elibm kernel: mps0: Firmware: 18.00.00.00, Driver: 19.00.00.00-fbsd
Sep 22 17:18:24 elibm kernel: mps0: IOCCapabilities: 1285c

The connected devices follow. Both use the same hardware (except for
the cables and HBA of course), but currently there's no way to check
this with the SAS3 card, as neither sas3ircu nor sas3flash detects it
on FreeBSD.

# sas2ircu 0 display
LSI Corporation SAS2 IR Configuration Utility.
Version 18.00.00.00 (2013.11.18)
Copyright (c) 2009-2013 LSI Corporation. All rights reserved.

Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type                 : SAS2008
  BIOS version                    : 7.35.00.00
  Firmware version                : 18.00.00.00
  Channel description             : 1 Serial Attached SCSI
  Initiator ID                    : 0
  Maximum physical devices        : 255
  Concurrent commands supported   : 3432
  Slot                            : 3
  Segment                         : 0
  Bus                             : 17
  Device                          : 0
  Function                        : 0
  RAID Support                    : No
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
Initiator at ID #0

Device is a Hard disk
  Enclosure #                     : 2
  Slot #                          : 16
  SAS Address                     : 5000c50-0-05b5-ce25
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 140014/286749479
  Manufacturer                    : SEAGATE
  Model Number                    : ST9146803SS
  Firmware Revision               : FS03
  Serial No                       : 3SD02W5L
  GUID                            : N/A
  Protocol                        : SAS
  Drive Type                      : SAS_HDD

Device is a Hard disk
  Enclosure #                     : 2
  Slot #                          : 17
  SAS Address                     : 5005076-0-3e8e-81a2
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08549F
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Hard disk
  Enclosure #                     : 2
  Slot #                          : 18
  SAS Address                     : 5005076-0-3e8e-81a3
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08548T
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Hard disk
  Enclosure #                     : 2
  Slot #                          : 19
  SAS Address                     : 5005076-0-3e8e-81a4
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08568E
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Hard disk
  Enclosure #                     : 2
  Slot #                          : 20
  SAS Address                     : 5005076-0-3e8e-81a5
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08547X
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Hard disk
  Enclosure #                     : 2
  Slot #                          : 21
  SAS Address                     : 5005076-0-3e8e-81a6
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08518Y
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Hard disk
  Enclosure #                     : 2
  Slot #                          : 22
  SAS Address                     : 5005076-0-3e8e-81a7
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08556K
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Enclosure services device
  Enclosure #                     : 2
  Slot #                          : 255
  SAS Address                     : 5005076-0-3e8e-81b9
  State                           : Standby (SBY)
  Manufacturer                    : IBM-ESXS
  Model Number                    : SAS EXP BP
  Firmware Revision               : 61A6
  Serial No                       : 00000006
  GUID                            : N/A
  Protocol                        : SAS
  Device Type                     : Enclosure services device

Device is a Hard disk
  Enclosure #                     : 3
  Slot #                          : 0
  SAS Address                     : 5005076-0-3e8e-86e9
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08550R
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Hard disk
  Enclosure #                     : 3
  Slot #                          : 1
  SAS Address                     : 5005076-0-3e8e-86ea
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08911Y
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Hard disk
  Enclosure #                     : 3
  Slot #                          : 2
  SAS Address                     : 5005076-0-3e8e-86eb
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 953869/1953525167
  Manufacturer                    : ATA
  Model Number                    : Samsung SSD 840
  Firmware Revision               : BB0Q
  Serial No                       : S1D9NEADA08811L
  GUID                            : N/A
  Protocol                        : SATA
  Drive Type                      : SATA_SSD

Device is a Hard disk
  Enclosure #                     : 3
  Slot #                          : 13
  SAS Address                     : 5000c50-0-05b5-e531
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 140014/286749479
  Manufacturer                    : SEAGATE
  Model Number                    : ST9146803SS
  Firmware Revision               : FS03
  Serial No                       : 3SD02STR
  GUID                            : N/A
  Protocol                        : SAS
  Drive Type                      : SAS_HDD

Device is a Hard disk
  Enclosure #                     : 3
  Slot #                          : 14
  SAS Address                     : 5000c50-0-05b5-d489
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 140014/286749479
  Manufacturer                    : SEAGATE
  Model Number                    : ST9146803SS
  Firmware Revision               : FS03
  Serial No                       : 3SD02TV1
  GUID                            : N/A
  Protocol                        : SAS
  Drive Type                      : SAS_HDD

Device is a Hard disk
  Enclosure #                     : 3
  Slot #                          : 15
  SAS Address                     : 5000c50-0-05b5-f0ad
  State                           : Ready (RDY)
  Size (in MB)/(in sectors)       : 140014/286749479
  Manufacturer                    : SEAGATE
  Model Number                    : ST9146803SS
  Firmware Revision               : FS03
  Serial No                       : 3SD03F4C
  GUID                            : N/A
  Protocol                        : SAS
  Drive Type                      : SAS_HDD

Device is a Enclosure services device
  Enclosure #                     : 3
  Slot #                          : 255
  SAS Address                     : 5005076-0-3e8e-86f9
  State                           : Standby (SBY)
  Manufacturer                    : IBM-ESXS
  Model Number                    : SAS EXP BP
  Firmware Revision               : 61A6
  Serial No                       : 00000006
  GUID                            : N/A
  Protocol                        : SAS
  Device Type                     : Enclosure services device
------------------------------------------------------------------------
Enclosure information
------------------------------------------------------------------------
  Enclosure#                      : 1
  Logical ID                      : 500605b0:07ba2100
  Numslots                        : 8
  StartSlot                       : 0
  Enclosure#                      : 2
  Logical ID                      : 50050760:3e8e81a0
  Numslots                        : 25
  StartSlot                       : 0
  Enclosure#                      : 3
  Logical ID                      : 50050760:3e8e86e0
  Numslots                        : 25
  StartSlot                       : 0
------------------------------------------------------------------------
SAS2IRCU: Command DISPLAY Completed Successfully.
SAS2IRCU: Utility Completed Successfully.

From bugzilla-noreply at freebsd.org Mon Sep 22 23:10:49 2014
From: bugzilla-noreply at freebsd.org (bugzilla-noreply at freebsd.org)
Date: Mon, 22 Sep 2014 23:10:49 +0000
Subject: [Bug 193696] CAM AC_FOUND_DEVICE calls malloc(M_WAITOK) from THREAD_NO_SLEEPING() context
In-Reply-To:
References:
Message-ID:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193696

--- Comment #5 from Garrett Cooper ---
The patch Scott provided looked ok (it doesn't panic on boot in simple
cases with a VM), but I didn't get an opportunity to test it more
extensively. I haven't yet tried a patch that incorporates the
feedback noted in comment # 4.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From xzpeter at gmail.com Tue Sep 23 08:11:37 2014
From: xzpeter at gmail.com (Xu Zhe)
Date: Tue, 23 Sep 2014 16:11:29 +0800
Subject: How to disable hard disk write cache?
In-Reply-To: <541EA01E.6020600@ixsystems.com>
References: <541804B0.7070407@gmail.com> <54184484.1070304@delphij.net> <5418FA4A.1050707@gmail.com> <541E974E.7060702@delphij.net> <541EA01E.6020600@ixsystems.com>
Message-ID: <54212B31.7050608@gmail.com>

Hi, Alexander, Xin,

This is exactly what I am looking for. Thanks a lot. :)

Peter

On 14-9-21 17:53, Alexander Motin wrote:
> [...]

From kylie.liang2 at outlook.com Wed Sep 24 14:41:26 2014
From: kylie.liang2 at outlook.com (Kylie Liang)
Date: Wed, 24 Sep 2014 14:40:19 +0000
Subject: scatter/gather list support
Message-ID:

http://www.freebsd.org/doc/en/books/arch-handbook/scsi-general.html
says: "Actually, it seems like the scatter-gather ability is not used
anywhere in the CAM code now. But at least the case for a single
non-scattered virtual buffer must be implemented, it is actively used
by CAM."

What does this mean? How does the CAM layer handle scatter-gather I/O
requests?

Thank you.

From bugzilla-noreply at freebsd.org Thu Sep 25 00:58:32 2014
From: bugzilla-noreply at freebsd.org (bugzilla-noreply at freebsd.org)
Date: Thu, 25 Sep 2014 00:58:32 +0000
Subject: [Bug 193696] CAM AC_FOUND_DEVICE calls malloc(M_WAITOK) from THREAD_NO_SLEEPING() context
In-Reply-To:
References:
Message-ID:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193696

--- Comment #6 from Bryan Drewery ---
Trace from xpt_done_td from pulling a device out of the system:

KASSERT failed: malloc(M_WAITOK) in no_sleeping context
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe349829a340
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe349829a3f0
_kassert_panic() at _kassert_panic+0xd7/frame 0xfffffe349829a470
malloc() at malloc+0x2e4/frame 0xfffffe349829a4c0
g_post_event_x() at g_post_event_x+0x84/frame 0xfffffe349829a510
g_post_event() at g_post_event+0x5d/frame 0xfffffe349829a580
adacleanup() at adacleanup+0x62/frame 0xfffffe349829a5a0
cam_periph_release_locked_buses() at cam_periph_release_locked_buses+0xde/frame 0xfffffe349829aaa0
cam_periph_release_locked() at cam_periph_release_locked+0x1b/frame 0xfffffe349829aac0
adadone() at adadone+0x26e/frame 0xfffffe349829ab20
xpt_done_process() at xpt_done_process+0x3a4/frame 0xfffffe349829ab60
xpt_done_td() at xpt_done_td+0x136/frame 0xfffffe349829abb0
fork_exit() at fork_exit+0x84/frame 0xfffffe349829abf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe349829abf0

-- 
You are receiving this mail because:
You are the assignee for the bug.

From bugzilla-noreply at freebsd.org Thu Sep 25 01:09:22 2014
From: bugzilla-noreply at freebsd.org (bugzilla-noreply at freebsd.org)
Date: Thu, 25 Sep 2014 01:09:22 +0000
Subject: [Bug 193696] CAM AC_FOUND_DEVICE calls malloc(M_WAITOK) from THREAD_NO_SLEEPING() context
In-Reply-To:
References:
Message-ID:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193696

--- Comment #7 from Bryan Drewery ---
https://reviews.freebsd.org/D829 - KASSERT_WARN
https://reviews.freebsd.org/D830 - Use KASSERT_WARN in malloc(9) and uma_zalloc_arg(9)

-- 
You are receiving this mail because:
You are the assignee for the bug.