RFC: GEOM MULTIPATH rewrite

Alexander Motin mav at FreeBSD.org
Tue Nov 1 13:05:43 UTC 2011


On 11/01/11 14:39, Pawel Jakub Dawidek wrote:
> On Mon, Oct 31, 2011 at 10:10:14PM +0200, Alexander Motin wrote:
>> An attempt to fix some GEOM MULTIPATH issues led me to almost rewrite
>> it, so I would like to present my results and ask for testing and
>> feedback.
>>
>> The main changes:
>>  - Improved locking and destruction process to fix crashes in many cases.
>>  - Improved "automatic" configuration method to make it safe by reading
>> metadata back from all specified paths after writing to one.
>>  - Added provider size check to reduce chance of conflict with other
>> GEOM classes.
>>  - Added "manual" configuration method without using on-disk metadata.
>>  - Added "add" and "remove" commands to manage paths manually.
>>  - Failed paths are no longer dropped from GEOM, but only marked as FAIL
>> and excluded from I/O operations.
>>  - Failed paths are automatically restored when all other paths are
>> marked as failed, for example, because of device-caused (not transport)
>> errors.
>>  - Added "fail" and "restore" commands to manually control FAIL flag.
>>  - GEOM is now destroyed on last provider disconnection. IMHO it is
>> the right thing to do if the device was completely removed.
>>  - Added optional Active/Active mode support. Unlike Active/Passive
>> mode, load is evenly distributed between all working paths. If the
>> device supports it, this can significantly improve performance by
>> utilizing the bandwidth of all paths. It is controlled by the -A option
>> during creation and is disabled by default for now.
>>  - Improved `status` and `list` commands output.
>>
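The failed-path handling and the two I/O modes in the list above can be modeled in a few lines. This is a minimal userspace sketch, not code from the patch: the class and method names, the round-robin cursor, and the choice to restore every path at the moment the last one fails are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    failed: bool = False  # models the FAIL flag


@dataclass
class Multipath:
    paths: list
    active_active: bool = False  # models creation with -A
    _next: int = 0  # hypothetical round-robin cursor

    def fail(self, name):
        """Mark a path as FAIL; it stays attached but receives no I/O."""
        for p in self.paths:
            if p.name == name:
                p.failed = True
        # Assumption: once every path is failed, all are restored so that
        # I/O can be retried (e.g. after a device-caused error).
        if all(p.failed for p in self.paths):
            for p in self.paths:
                p.failed = False

    def restore(self, name):
        """Clear the FAIL flag on one path."""
        for p in self.paths:
            if p.name == name:
                p.failed = False

    def select(self):
        """Pick the path for the next request."""
        working = [p for p in self.paths if not p.failed]
        if not working:
            raise RuntimeError("no paths attached")
        if not self.active_active:
            return working[0].name  # Active/Passive: first working path
        path = working[self._next % len(working)]  # Active/Active
        self._next += 1
        return path.name
```

With two working paths and `active_active=True`, successive `select()` calls alternate between them; in the default mode every request goes to the first path that is not marked FAIL.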
>> Latest patch can be found here:
>> http://people.freebsd.org/~mav/gmultipath4.patch
>>
>> Feedback is welcome!
>>
>> Sponsored by: iXsystems, Inc.
> 
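The read-back check described for the "automatic" configuration method (write metadata through one path, then read it back through all specified paths) can be sketched as follows. This is an illustrative model only: the `label` function, the use of the last sector for metadata, and the rollback on mismatch are assumptions, not details taken from the patch.

```python
METADATA_SECTOR = -1  # assumption: metadata is kept in the last sector


def label(devices, paths, metadata):
    """Write metadata via the first path, then verify it via every path.

    `devices` maps a path name to the list of sectors backing it; two
    names that map to the same list object model two paths to one disk.
    """
    first = paths[0]
    previous = devices[first][METADATA_SECTOR]
    devices[first][METADATA_SECTOR] = metadata
    mismatched = [p for p in paths
                  if devices[p][METADATA_SECTOR] != metadata]
    if mismatched:
        # Assumption: undo the write so a rejected label leaves no trace.
        devices[first][METADATA_SECTOR] = previous
        raise ValueError(f"not paths to the same device: {mismatched}")
    return list(paths)
```

A provider that does not return the freshly written metadata cannot be another path to the same device, so the sketch refuses to group it.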
> There are two possible issues that come to my mind; not sure if you
> address them.
> 
> 1. When configuration is based on on-disk metadata, GEOM spoil/taste is
>    not fully helpful - if you have two paths, da0 and da1, and I write
>    to da0, gmultipath won't be informed by GEOM that da1 changed as well.
>    One solution is to keep all paths open exclusively all the time,
>    even if the gmultipath provider is not open; another is to emulate
>    spoil/taste for the other paths if any path was modified.

I now open all underlying providers exclusively on attach, to spoil other
consumers. This is configurable via sysctl.
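The effect of holding every path open exclusively can be illustrated with a toy access model. This is a simplified sketch, not GEOM's actual access-count code: the `Provider` class, its `open` method, and the `exclusive` flag (standing in for the sysctl, whose name is not given here) are hypothetical.

```python
class Provider:
    """Toy provider that tracks write and exclusive opens per consumer."""

    def __init__(self, name):
        self.name = name
        self.writers = set()
        self.exclusive = set()

    def open(self, consumer, write=False, exclusive=False):
        other_writers = self.writers - {consumer}
        other_exclusive = self.exclusive - {consumer}
        if write and other_exclusive:
            raise PermissionError(f"{self.name}: held exclusively")
        if exclusive and other_writers:
            raise PermissionError(f"{self.name}: already open for writing")
        if write:
            self.writers.add(consumer)
        if exclusive:
            self.exclusive.add(consumer)


def attach(consumer, providers, exclusive=True):
    """Open every underlying path as soon as the multipath device attaches."""
    for provider in providers:
        provider.open(consumer, exclusive=exclusive)
```

In this model, once the multipath consumer holds da0 and da1 exclusively, nobody else can open da1 for writing, so the metadata cannot change through one path without the multipath device being involved.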

> 2. In active/active mode, do you do anything to handle possible
>    reordering? I.e., if you have overlapping writes and send them
>    using different paths, you cannot be sure that order will be
>    preserved. Most of the time that's not a problem, as file systems
>    rarely, if at all, send overlapping writes to a device, but this is a
>    weak assumption.

No, I don't. I doubt that it is sane to send even dependent I/O
simultaneously without waiting for completion, let alone overlapping
I/O. Since most present devices support command queuing, and so are
officially allowed to reorder simultaneous commands as they see fit, I
am not sure why the layers above should be stricter, especially in cases
where that is problematic. If somebody has ideas about why and how to
implement it, I am ready to discuss.
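The reordering concern can be made concrete with a small simulation of two overlapping writes sent down different paths. This is a minimal sketch under stated assumptions: the `run` function, the per-path latencies, and the 8-byte "disk" are all hypothetical and model nothing about real hardware timing.

```python
def run(writes, latency):
    """Apply writes in completion order and return the final disk image.

    `writes` is a list of (submit_time, path, offset, data) tuples and
    `latency` maps a path name to its hypothetical service time.
    """
    disk = bytearray(8)
    # Completion time, not submission time, decides what lands last.
    done = sorted(writes, key=lambda w: w[0] + latency[w[1]])
    for _, _, offset, data in done:
        disk[offset:offset + len(data)] = data
    return bytes(disk)


writes = [
    (0, "da0", 0, b"AAAA"),  # submitted first
    (1, "da1", 2, b"BBBB"),  # submitted second, overlaps bytes 2-3
]
in_order = run(writes, {"da0": 1, "da1": 1})   # b"AABBBB\x00\x00"
reordered = run(writes, {"da0": 5, "da1": 1})  # b"AAAABB\x00\x00"
```

When the first path is slower, the earlier write completes last and overwrites the overlapping bytes, which is the outcome a caller gets if it issues both writes without waiting for the first to complete.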

-- 
Alexander Motin


More information about the freebsd-current mailing list