RFC: GEOM MULTIPATH rewrite
mav at FreeBSD.org
Tue Nov 1 13:05:43 UTC 2011
On 11/01/11 14:39, Pawel Jakub Dawidek wrote:
> On Mon, Oct 31, 2011 at 10:10:14PM +0200, Alexander Motin wrote:
>> An attempt to fix some GEOM MULTIPATH issues led me to almost rewrite it.
>> So I would like to present my results and request testing and feedback.
>> The main changes:
>> - Improved locking and destruction process to fix crashes in many cases.
>> - Improved "automatic" configuration method to make it safe by reading
>> metadata back from all specified paths after writing to one.
>> - Added provider size check to reduce chance of conflict with other
>> GEOM classes.
>> - Added "manual" configuration method without using on-disk metadata.
>> - Added "add" and "remove" commands to manage paths manually.
>> - Failed paths are no longer dropped from the GEOM, but only marked as
>> FAIL and excluded from I/O operations.
>> - Automatically restore failed paths when all other paths are marked
>> as failed, for example, because of device-caused (not transport) errors.
>> - Added "fail" and "restore" commands to manually control FAIL flag.
>> - The GEOM is now destroyed when the last provider disconnects. IMHO that
>> is the right thing to do if the device was completely removed.
>> - Added optional Active/Active mode support. Unlike Active/Passive
>> mode, load is evenly distributed between all working paths. If supported
>> by the device, it can significantly improve performance by utilizing the
>> bandwidth of all paths. It is controlled by the -A option during creation
>> and is disabled by default for now.
>> - Improved `status` and `list` commands output.
>> Latest patch can be found here:
>> Feedback is welcome!
>> Sponsored by: iXsystems, Inc.
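To make the FAIL flag and Active/Active behaviour described in the list above easier to discuss, here is a minimal user-space model in Python. It is only a sketch under stated assumptions, not the kernel code: the class and method names are hypothetical, round-robin is just one possible way to distribute load evenly, and "restore every path once all of them are marked FAIL" is one reading of the automatic restore rule.

```python
from dataclasses import dataclass, field


@dataclass
class Path:
    name: str
    failed: bool = False  # models the FAIL flag


@dataclass
class Multipath:
    paths: list
    active_active: bool = False  # models creation with the -A option
    _next: int = field(default=0, repr=False)  # round-robin cursor (assumption)

    def working(self):
        """Paths that are not marked FAIL and may receive I/O."""
        return [p for p in self.paths if not p.failed]

    def fail(self, name):
        """Mark a path FAIL; it stays in the list but gets no I/O."""
        for p in self.paths:
            if p.name == name:
                p.failed = True
        # Assumed restore rule: if every path is now marked FAIL,
        # clear the flag on all of them so I/O can be retried.
        if self.paths and not self.working():
            for p in self.paths:
                p.failed = False

    def restore(self, name):
        """Manually clear the FAIL flag on a path."""
        for p in self.paths:
            if p.name == name:
                p.failed = False

    def select(self):
        """Pick the path for the next request."""
        ok = self.working()
        if not ok:
            raise RuntimeError("no usable paths")
        if not self.active_active:
            return ok[0]  # Active/Passive: always the first working path
        path = ok[self._next % len(ok)]  # Active/Active: rotate over working paths
        self._next += 1
        return path


mp = Multipath([Path("da0"), Path("da1")], active_active=True)
print([mp.select().name for _ in range(4)])  # alternates between da0 and da1
mp.fail("da0")
print(mp.select().name)                      # only da1 is left for I/O
```

In this model a failed path is never removed from `paths`, which mirrors the point that failed paths are only marked and excluded from I/O rather than dropped.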
> There are two possible issues that come to my mind; I am not sure if you
> address them.
> 1. When configuration is based on on-disk metadata, GEOM spoil/taste is
> not fully helpful - if you have two paths: da0 and da1 and I write
> to da0, gmultipath won't be informed by GEOM that da1 changed as well.
> One solution is to keep all paths open exclusively all the time,
> even if the gmultipath provider is not open, or to emulate spoil/taste
> for the other paths whenever any path is modified.
I now open all underlying providers exclusively on attach, in order to
spoil other consumers. This behavior is configurable via a sysctl.
> 2. In active/active mode do you do anything to handle possible
> reordering? I.e. if you have overlapping writes and send both of them
> using different paths, you cannot be sure that order will be
> preserved. Most of the time that's not a problem, as file systems
> rarely, if ever, send overlapping writes to the device, but this is weak
No, I don't. I doubt that it is sane to send even dependent I/O
simultaneously without waiting for completion, let alone overlapping
I/O. Since most present devices support command queuing, and are thus
officially allowed to reorder simultaneous commands as they see fit, I
am not sure why the layers above should be stricter, especially in cases
where that is problematic. If somebody has ideas on why and how to
implement it, I am ready to discuss.
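
For reference, the reordering concern can be reproduced with a toy model. The sketch below is plain Python, not GEOM code, and every name in it is hypothetical: two writes overlap on bytes 2..3, and the final device contents depend on which one completes last, which is exactly what is not guaranteed when the two are sent down different paths.

```python
def overlaps(a, b):
    """True if two (offset, data) writes touch at least one common byte."""
    a_off, a_data = a
    b_off, b_data = b
    return a_off < b_off + len(b_data) and b_off < a_off + len(a_data)


def apply_writes(size, writes, completion_order):
    """Apply (offset, data) writes to a zeroed device in completion order."""
    dev = bytearray(size)
    for i in completion_order:
        offset, data = writes[i]
        dev[offset:offset + len(data)] = data
    return bytes(dev)


writes = [(0, b"AAAA"), (2, b"BBBB")]          # overlap on bytes 2..3
in_order = apply_writes(8, writes, [0, 1])     # first write completes first
reordered = apply_writes(8, writes, [1, 0])    # second write completes first

print(overlaps(writes[0], writes[1]))  # True
print(in_order)                        # b'AABBBB\x00\x00'
print(reordered)                       # b'AAAABB\x00\x00'
```

Any layer that wanted to be stricter would presumably need a check like `overlaps()` on in-flight requests and would have to hold back the later one, which is the extra cost being questioned above.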