multipath-tools/.git
Hannes Reinecke [Fri, 13 Dec 2013 12:12:40 +0000 (13:12 +0100)]
multipath: do not print 'path is up' for removed paths

When a path is removed the previous checker message is still
kept in the checker context, and will be printed upon each
check. This causes multipath to print out
'path is up'
even though it already has been removed from sysfs.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 12:12:39 +0000 (13:12 +0100)]
libmultipath: Fix typo in retain_attached_hw_handler

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 12:12:38 +0000 (13:12 +0100)]
Document 'wwids_file' and 'reservation_key'

Add documentation for 'wwids_file' and 'reservation_key' to
multipath.conf.annotated.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 12:12:37 +0000 (13:12 +0100)]
Remove trailing spaces from sysfs attributes

Some sysfs attributes may contain trailing spaces, which only
serve to confuse matters. So strip them before continuing.

Signed-off-by: Hannes Reinecke <hare@suse.de>
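
Illustration (not part of the patch): a minimal sketch of the trimming described above, assuming the sysfs attribute has already been read into a NUL-terminated buffer. The helper name 'strip_trailing_spaces' is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: remove trailing whitespace (spaces, tabs,
 * newlines) from 'value' in place and return the new length.
 */
size_t strip_trailing_spaces(char *value)
{
    size_t len = strlen(value);

    while (len > 0 && isspace((unsigned char)value[len - 1]))
        value[--len] = '\0';
    return len;
}
```
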
Hannes Reinecke [Fri, 13 Dec 2013 09:11:19 +0000 (10:11 +0100)]
multipathd: Implement systemd watchdog integration

In the past there have been several instances where multipathd
would hang in the checkerloop because some path checker was not
able to return in time.
This patch now activates the watchdog feature from systemd
to shut down (and possibly restart) multipathd in these
situations.
Due to a bug in systemd, watchdog integration only works
correctly with later versions (> 206), so watchdog integration
has been disabled by default.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 09:11:18 +0000 (10:11 +0100)]
Use correct systemd unit directory

The systemd unit directory has been moved to /usr/lib/systemd.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 09:11:17 +0000 (10:11 +0100)]
multipathd: enable core dumps for systemd

Add 'LimitCORE' definition to the service file to enable core
dumps when running under systemd.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:10 +0000 (00:43 -0600)]
kpartx: autoload loop module on loop partition creation

Currently kpartx doesn't do anything to force the loop module to
autoload, so creating partitions over files fails if the loop module
isn't already loaded.  This patch makes kpartx try to find the next
available loop device by ioctling /dev/loop-control, which will
autoload the loop module.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:09 +0000 (00:43 -0600)]
multipathd: allow /dev/<devnode> to be used for multipathd commands

Multipathd expects that path and map names used in its interactive
commands are sysfs names with no directory.  However, users often try
to use /dev names instead.  This patch makes multipathd convert the /dev
names to sysfs names like the multipath command does.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:08 +0000 (00:43 -0600)]
libmultipath: blacklist blktap devices

Multipath can't run on top of blktap devices, so it should just ignore them.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:07 +0000 (00:43 -0600)]
kpartx: fix strict aliasing warning

Compiling with strict aliasing throws warnings about aliasing
volume_label_t to an array of unsigned ints. Adding __may_alias__
lets it know that we meant to do this.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:06 +0000 (00:43 -0600)]
multipathd: Don't touch the multipath device after setup_multipath fails

If setup_multipath fails, it removes the device. multipath always needs
to check its return value and not touch the device if setup_multipath
failed.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:05 +0000 (00:43 -0600)]
libmultipath: show default configurations

"multipathd show config" should show the current configuration, even if
the current values match the default values.  Omitting the values when
they match the compiled-in defaults just makes it harder for users to
see how multipath is configured.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:04 +0000 (00:43 -0600)]
kpartx: Make kpartx able to create read-only loop devices

This just passes the read-only value into set_loop, and falls back
to read-only mapping on EACCES to handle immutable files.

Signed-off-by: Till Maas <opensource@till.name>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:03 +0000 (00:43 -0600)]
libmultipath: try to deal with bindings file hand editing

Unfortunately, instead of adding aliases to /etc/multipath.conf like they
should, some users add aliases to the bindings file. If they add aliases
that look like user_friendly_names, like mpathfoo, they interfere with
the code to find the next available user_friendly_name (giving you
mpathfop as the next user_friendly_name in this case).  This patch
fixes this by giving the next available name after an unbroken
string of names starting with the smallest possible name. In order to keep
the choice quick, it won't handle situations where the list of names
starting at the smallest possible is not in order.  There are a number
of ways to do a better job finding the smallest possible name, but this
will fix most of the cases I've seen, without slowing things down at all.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
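
Illustration (not taken from the patch): one way to get the behaviour described above in a single pass, assuming every alias in the bindings file has already been converted to a numeric id, with 1 standing for the smallest possible name. The function name, the id representation, and the fallback to "biggest id plus one" when the list is out of order are assumptions of this sketch.

```c
#include <limits.h>
#include <stddef.h>

/*
 * ids[] holds the numeric ids found in the bindings file, in file order
 * (hypothetical representation; 1 is the smallest possible name).
 * Returns the next id to hand out, or -1 if none is available.
 */
int next_free_id(const int *ids, size_t count)
{
    int id = 1;                     /* candidate: end of the unbroken run */
    int biggest = 0;
    int smallest_bigger = INT_MAX;  /* smallest id seen above the candidate */
    size_t i;

    for (i = 0; i < count; i++) {
        int curr = ids[i];

        if (curr == id)
            id++;                   /* the run continues in order */
        else if (curr > id && curr < smallest_bigger)
            smallest_bigger = curr;
        if (curr > biggest)
            biggest = curr;
    }
    /* Out-of-order list: the candidate may already be taken, so fall back. */
    if (id >= smallest_bigger) {
        if (biggest == INT_MAX)
            return -1;
        id = biggest + 1;
    }
    return id;
}
```

With ids 1, 2, 3 followed by a hand-added alias that maps to a large id, the sketch returns 4 instead of the large id plus one. When the entries are not in order it gives up on finding the smallest free id and returns a name past the biggest one, which keeps the scan cheap.
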
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:01 +0000 (00:43 -0600)]
fix multipath -W on empty wwids file

When multipath tries to open the wwids file and finds it empty or
missing, it writes the header to the file.  When it tried to wipe
the wwids from an empty or missing file, it didn't seek back to the
start of the file after truncating it.  This caused the wwids
file to have a patch of zeroed bytes at the start. This patch fixes
this by always seeking back to the start of the file before rewriting
the header.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:00 +0000 (00:43 -0600)]
signal waiter thread to stop waiting on dm events

The ioctl syscall is not a pthread cancellation point. So, when
device-mapper is waiting on events, pthread_cancel won't cancel
the waiter thread.  This patch makes the waiter threads handle
SIGUSR2, and has stop_waiter_thread() signal it to break out of
waiting on dm events. Unfortunately, there is still a possibility
of the signal arriving before the ioctl, and the waiter thread
still hanging until the devices are reloaded.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:59 +0000 (00:42 -0600)]
Only load a multipath device RO if RW load fails with EROFS

If a transient error keeps multipath from being able to reload a
multipath device read/write, it will retry the reload with the
device set to read-only.  This can suddenly turn a read/write
multipath device read-only.  If device-mapper cannot load the
device read/write because a path device can't be opened for writing,
it will return EROFS.  This is the only time it makes sense to
reload the device read-only.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
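
Illustration (not the patch itself): a minimal sketch of the retry rule described above. The loader callback type and all function names are hypothetical; the two stand-in loaders exist only to demonstrate the two outcomes.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical loader: returns 0 on success, or -1 with errno set. */
typedef int (*load_fn)(bool read_only);

/* Try read/write first; fall back to read-only only on EROFS. */
int reload_map(load_fn load)
{
    if (load(false) == 0)
        return 0;
    if (errno != EROFS)
        return -1;      /* transient or unrelated error: do not downgrade */
    return load(true);
}

/* Stand-in loaders for demonstration only. */
int load_rofs(bool read_only)
{
    if (read_only)
        return 0;
    errno = EROFS;      /* a path device cannot be opened for writing */
    return -1;
}

int load_eio(bool read_only)
{
    (void)read_only;
    errno = EIO;        /* some other failure */
    return -1;
}
```
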
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:58 +0000 (00:42 -0600)]
Don't detect prioritizers that don't work

The current method used by detect_prio was selecting the ALUA
prioritizer for devices that didn't have ALUA enabled.  This patch
makes detect_prio go through all the steps to get a priority with the
ALUA prioritizer.  If it is able to successfully get a priority, then
it selects the ALUA prioritizer for the device.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:57 +0000 (00:42 -0600)]
Turn off user_friendly_names for netapp devices

Netapp has requested that even if user friendly names is enabled in the
defaults section, they would like it disabled for their devices.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:56 +0000 (00:42 -0600)]
Fix multipath rename from user_friendly_name to wwid

When multipath was selecting an alias for a device on reload, if it
didn't have an explicit alias, and user_friendly_names wasn't set,
multipath would use the existing alias, if one existed.  This made it
impossible to turn off user_friendly_names, and then reconfigure to
change the device names back to wwids.

Instead, multipath should just use the wwid as an alias, if that's
what it's configured to do, regardless of the existing name.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:55 +0000 (00:42 -0600)]
Add multipath path format wildcard

This adds a new format wildcard, 'm', to be used with

multipathd show paths format

It prints the multipath device associated with the path.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Mateusz Półrola [Wed, 11 Dec 2013 05:43:57 +0000 (06:43 +0100)]
[multipathd] Declaration of envp variable in child function is missing

Hannes Reinecke [Tue, 26 Nov 2013 11:41:30 +0000 (12:41 +0100)]
multipathd: measure path check time

Instrument code to measure path check time.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:29 +0000 (12:41 +0100)]
multipathd: Read environment variables from systemd

When systemd adjusts 'OOMScoreAdjust' and 'LimitNOFILE'
we should take those settings and not try to adjust them
again on our side.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:26 +0000 (12:41 +0100)]
multipathd: use sd_notify() to inform systemd

Implement sd_notify() to inform systemd about our internal state.
And we should be using the service type 'notify' so that systemd
doesn't try to flood multipathd if it's still in discovery.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:25 +0000 (12:41 +0100)]
multipathd: switch to socket activation for systemd

multipathd already has a netlink socket for CLI commands, which
can be used for socket activation.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:24 +0000 (12:41 +0100)]
multipathd: Add option '-s' to suppress timestamps

systemd prefixes any messages to stdout with a timestamp, so it's
quite pointless to do it ourselves.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:23 +0000 (12:41 +0100)]
Use system-provided regex implementation

There is zero value in carrying our own (old) regex implementation
around; we're far better off using the system-provided one.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:22 +0000 (12:41 +0100)]
Clarify uxsock logging

Socket creation might fail at various stages, so print out a
proper logging message.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Xinghai Yu [Thu, 28 Nov 2013 02:12:24 +0000 (10:12 +0800)]
move the 'update_multipath_strings()' function up so that mpp->mpe can be assigned in some situation

This patch corrects a bug in the handling of multipath's no_path_retry attribute.
When a multipath device has just been created by the user command 'multipath',
the DM driver does not yet know the wwid of the dm device, so the
'set_multipath_wwid()' function cannot get the wwid for mpp at that point,
and mpp->mpe cannot be found and assigned by wwid either. In that case, the
following function 'set_no_path_retry()' cannot use the value provided by
mpp->mpe, whose values were read from the file '/etc/multipath.conf'. But the
function 'update_multipath_strings()' can assign the wwid to mpp from its path,
so move this function up to solve the problem.

Steps to reproduce the bug:
1. First, by setting 'no_path_retry fail', we have features='0'.
[root@localhost multipath-tools]# cat /etc/multipath.conf
...
multipath {
        wwid                    36000b5d0006a0000006a14e7000b0000
        alias                   yellow
        path_grouping_policy    failover
        no_path_retry           fail
}
...
[root@localhost multipath-tools]# multipath -ll
yellow (36000b5d0006a0000006a14e7000b0000) dm-0
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=130 status=active
| `- 0:0:0:0 sda 8:0    active ready running
`-+- policy='service-time 0' prio=130 status=enabled
  `- 1:0:0:0 sdc 8:32   active ready running
[root@localhost multipath-tools]#

2. Second, we flush all multipath device maps.
[root@localhost multipath-tools]# multipath -F
[root@localhost multipath-tools]#

3. Third, we create all multipath device maps.
[root@localhost multipath-tools]# multipath
create: yellow (36000b5d0006a0000006a14e7000b0000) undef
size=50G features='0' hwhandler='0' wp=undef
|-+- policy='service-time 0' prio=130 status=undef
| `- 0:0:0:0 sda 8:0    undef ready running
`-+- policy='service-time 0' prio=130 status=undef
  `- 1:0:0:0 sdc 8:32   undef ready running
[root@localhost multipath-tools]#

4. Finally, we find "features='1 queue_if_no_path'", which is not what we
expect; this is the bug. Applying this patch corrects it.
[root@localhost multipath-tools]# multipath -ll
yellow (36000b5d0006a0000006a14e7000b0000) dm-0
size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=130 status=active
| `- 0:0:0:0 sda 8:0    active ready running
`-+- policy='service-time 0' prio=130 status=enabled
  `- 1:0:0:0 sdc 8:32   active ready running
[root@localhost multipath-tools]#

Signed-off-by: Xinghai Yu <yuxinghai@cn.fujitsu.com>
Hannes Reinecke [Fri, 15 Nov 2013 10:29:36 +0000 (11:29 +0100)]
libmultipath: do not stall on recv_packet()

The CLI socket might have been closed or the daemon might have
been terminated by systemd without closing the CLI socket.
Hence we need to poll the socket to check whether further data is available,
otherwise the read() call will hang.

Signed-off-by: Hannes Reinecke <hare@suse.de>
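
Illustration (not the patch itself): a minimal sketch of polling a descriptor before reading so that a silent peer cannot block the caller forever. The function name and the timeout parameter are hypothetical.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to 'timeout_ms' for data on 'fd'.
 * Returns the number of bytes read, 0 if the peer closed the
 * connection, or -1 on error or timeout (errno = ETIMEDOUT).
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;
    if (ret == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    /* Data is ready, or the peer hung up and read() reports EOF. */
    return read(fd, buf, len);
}
```
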
Hannes Reinecke [Fri, 15 Nov 2013 10:29:35 +0000 (11:29 +0100)]
libmultipath: return error numbers from sysfs_get_XXX

Instead of returning hand-crafted error values from sysfs_get_XXX
functions we should be using the standard error numbers.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 15 Nov 2013 10:29:34 +0000 (11:29 +0100)]
libmultipath: fixup strlcpy

The final comparison was wrong; 'size' was never decreased.

Signed-off-by: Hannes Reinecke <hare@suse.de>
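
Illustration (not the libmultipath code): a minimal sketch of a bounded copy that follows the BSD strlcpy convention, where the loop bound is checked against the remaining space and the return value is the full source length. The name 'example_strlcpy' is hypothetical.

```c
#include <stddef.h>

/*
 * Copy at most size - 1 bytes from 'src' to 'dst' and always
 * NUL-terminate when size > 0. Returns strlen(src), so a return
 * value >= size tells the caller the copy was truncated.
 */
size_t example_strlcpy(char *dst, const char *src, size_t size)
{
    size_t n = 0;

    if (size > 0) {
        while (n < size - 1 && src[n] != '\0') {
            dst[n] = src[n];
            n++;
        }
        dst[n] = '\0';
    }
    /* Keep counting so the full source length is returned. */
    while (src[n] != '\0')
        n++;
    return n;
}
```
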
Hannes Reinecke [Fri, 15 Nov 2013 10:29:33 +0000 (11:29 +0100)]
Set priority to '0' for PATH_BLOCKED or PATH_DOWN

When a path is down or blocked we need to initialize the priority
to '0'. Otherwise multipathd will discard the maps during reload
and fail to start if all paths are temporarily down.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 15 Nov 2013 10:29:32 +0000 (11:29 +0100)]
Improve logging for orphan_path()

orphan_path() is called from various sections, so add a
description to the call to aid debugging.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Sean Stewart [Thu, 14 Nov 2013 00:09:29 +0000 (18:09 -0600)]
libmultipath/checkers/rdac.c: Use RTPG data in RDAC checker

Make the RDAC checker utilize the RTPG data given in the 0xC9 inquiry
page to make intelligent decisions about path status, and print more
descriptive path down messages.

Signed-off-by: Sean Stewart <sean.stewart@netapp.com>
Sean Stewart [Thu, 14 Nov 2013 00:06:34 +0000 (18:06 -0600)]
libmultipath/discovery.c: Fix a condition that will lead to a table reload during every check.

This happens if path_offline says PATH_UP and the checker says PATH_DOWN.  In
this case, pathinfo gets called twice, and the first call, from
update_prio, sets the priority to -1 because the checker says the path is
down.  On the second pathinfo call, from update_path_groups, it will call the
prioritizer based on the fact that the prio is -1. This leads to a flip
flop of the priority value and a reload on every check.

Signed-off-by: Sean Stewart <sean.stewart@netapp.com>
Mike Christie [Fri, 25 Oct 2013 00:50:55 +0000 (19:50 -0500)]
multipathd: revert mpp size update if map update fails

If updating the dm device in the kernel fails we cannot leave
the mpp size updated, because if we correct the problem and
try to resize later multipathd prevents resizing the device
if the size has not changed.

I hit this when all paths to a dm-multipath device were not
yet updated but multipathd resize was run.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Zheng Huai Cheng [Mon, 14 Oct 2013 19:44:25 +0000 (21:44 +0200)]
multipath: Allow whitespace before or after exit and quit command in multipathd

This problem was reported by an IBM internal test team.

In multipathd interactive mode, commands allow leading and trailing
whitespace while quit and exit do not. If there is whitespace before or
after the quit|exit command, multipathd won't recognize the command
and won't exit the interactive mode.

This patch adds support for whitespace around the quit and exit commands.

Signed-off-by: Zheng Huai Cheng <zhenghch@linux.vnet.ibm.com>
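
Illustration (not the patch itself): a minimal sketch of recognizing 'quit' and 'exit' regardless of surrounding whitespace, assuming the input line is a NUL-terminated string. The function name is hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Return true if 'line' is "quit" or "exit" once whitespace is trimmed. */
bool is_exit_command(const char *line)
{
    const char *start = line;
    const char *end;
    size_t len;

    while (isspace((unsigned char)*start))
        start++;
    end = start + strlen(start);
    while (end > start && isspace((unsigned char)end[-1]))
        end--;
    len = (size_t)(end - start);

    return len == 4 &&
           (strncmp(start, "quit", 4) == 0 || strncmp(start, "exit", 4) == 0);
}
```
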
Christophe Varoqui [Wed, 17 Jul 2013 20:19:38 +0000 (22:19 +0200)]
Remove again multipath_dir from multipath.conf.defaults

Suggested, again, by Xose Vasquez Perez <xose.vazquez@gmail.com>

Christophe Varoqui [Wed, 17 Jul 2013 20:16:35 +0000 (22:16 +0200)]
Sync the Netapp INF-01-00 defaults to multipath.conf.defaults

Suggested by Xose Vasquez Perez <xose.vazquez@gmail.com>

Christophe Varoqui [Tue, 16 Jul 2013 20:31:40 +0000 (22:31 +0200)]
Build fixes for mpath_persist and multipath

libudev must be linked there too.

Hannes Reinecke [Tue, 16 Jul 2013 07:13:09 +0000 (09:13 +0200)]
Update multipath.conf.defaults

Signed-off-by: Hannes Reinecke <hare@suse.de>
Petr Uzel [Tue, 16 Jul 2013 07:13:19 +0000 (09:13 +0200)]
multipath: fix setting of fast_io_fail_tmo

Setting fast_io_fail_tmo to the same value as dev_loss_tmo is
not allowed by the kernel. Increase dev_loss_tmo by 1 in such
cases to make it strictly greater than fast_io_fail_tmo.

Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
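
Illustration (not the patch itself): the adjustment described above reduces to a single comparison. A minimal sketch, assuming both timeouts are plain second counts; the function name is hypothetical and special values such as 'infinity' are not handled here.

```c
/*
 * Return the dev_loss_tmo value to use: bumped by one second when it
 * equals fast_io_fail_tmo, otherwise unchanged.
 */
unsigned int adjust_dev_loss_tmo(unsigned int dev_loss_tmo,
                                 unsigned int fast_io_fail_tmo)
{
    if (dev_loss_tmo == fast_io_fail_tmo)
        return dev_loss_tmo + 1;
    return dev_loss_tmo;
}
```
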
Hannes Reinecke [Tue, 16 Jul 2013 07:13:16 +0000 (09:13 +0200)]
multipathd: valgrind fixes

valgrind complained about uninitialized memory.
As usual, valgrind was right, although the memory never was
actually referenced.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:17 +0000 (09:13 +0200)]
multipathd: increase stacksize for uevent listener

libudev has quite some heavy stack usage, so the default stack size
is not enough here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:15 +0000 (09:13 +0200)]
multipath: reference the udev context when starting event queue

The uevent listener is running asynchronously, so it might still
be active and receiving events when the main thread is already
shut down. So it needs to take a separate reference to the udev
context to avoid the context becoming invalid while the listener
is running.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:18 +0000 (09:13 +0200)]
Specify checker_timeout in seconds

Commit 8d3f07da changed the internal value for checker_timeout
to be in milliseconds, which wasn't reflected in the tur checker.
So better scale it back to seconds, and change the callers to
scale it to milliseconds where appropriate.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:14 +0000 (09:13 +0200)]
Do not print error when rport is blocked

When an rport is blocked any write to the dev_loss_tmo
attribute will fail with EBUSY. But that's perfectly
normal and nothing to worry about, so decrease the
logging priority for these cases.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:20 +0000 (09:13 +0200)]
multipath: reset queue_if_no_path if flush failed

When flushing a map fails, the 'queue_if_no_path' setting
gets lost.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:12 +0000 (09:13 +0200)]
multipath.conf.annotated: remove 'udev_dir'

Deprecated, remove it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:13 +0000 (09:13 +0200)]
multipath: Implement 'property' blacklist

Multipath can only properly handle devices which support VPD
page 0x83. Originally this was ensured by 'scsi_id', which would
not present an ID_SERIAL value in these cases.
With the move to udev 'ID_SERIAL' is now always present, so
multipath would try to attach to _all_ SCSI devices.
This patch implements a 'property' blacklist, which allows
blacklisting a device based on the existence of udev properties.
Any device not providing the udev property from the whitelist
will be ignored.
The default whitelist is set to '(ID_WWN|ID_SCSI_VPD)'.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:21 +0000 (09:13 +0200)]
libmultipath: read path state directly from sysfs

The 'state' attribute of a SCSI device might change without
generating any udev events, so we need to read it directly
from sysfs.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:06 +0000 (09:13 +0200)]
multipath.conf.annotated: Document rr_min_io_rq

rr_min_io_rq wasn't documented in multipath.conf.annotated.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:10 +0000 (09:13 +0200)]
Correctly set pgfailback

Something weird happened to pgfailback; no default was assigned
when loading the configuration, but then it got set (wrongly)
to the default value when printing the configuration.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:08 +0000 (09:13 +0200)]
Correctly set max_fds in case of failure

If we fail to get the system limit for max_fds we should assume
a safe value of 4096.

Signed-off-by: Hannes Reinecke <hare@suse.de>
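
Illustration (not the patch itself): a minimal sketch of falling back to 4096 when the system limit cannot be read. It assumes the limit is read as a decimal number from a file whose path the caller supplies; the function name, the macro name, and that way of obtaining the limit are assumptions.

```c
#include <stdio.h>

#define FALLBACK_MAX_FDS 4096   /* the safe value named in the commit */

/*
 * Hypothetical helper: read a positive decimal limit from 'path',
 * or return FALLBACK_MAX_FDS if the file is missing or malformed.
 */
long read_max_fds(const char *path)
{
    long value;
    FILE *f = fopen(path, "r");

    if (!f)
        return FALLBACK_MAX_FDS;
    if (fscanf(f, "%ld", &value) != 1 || value <= 0)
        value = FALLBACK_MAX_FDS;
    fclose(f);
    return value;
}
```
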
Hannes Reinecke [Tue, 16 Jul 2013 07:13:07 +0000 (09:13 +0200)]
Correctly print out 'max' for max_fds

If the value specified for 'max_fds' is the system-wide limit
we should be printing out 'max', and not that value.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:11 +0000 (09:13 +0200)]
multipath.conf.5: clarify 'no_path_retry' default setting

The default setting for 'no_path_retry' is 'unset', so we should
be stating this in the man page.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Petr Uzel [Tue, 16 Jul 2013 07:13:01 +0000 (09:13 +0200)]
kpartx: support disk with non-512B sectors

libdevmapper expects sector size to be recalculated to 512B, so we need
to teach kpartx to do so if the underlying DM device has different
sector size (for GPT and msdos partition tables).

Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
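
Illustration (not the kpartx code): the recalculation described above is a unit conversion from native sectors to 512-byte sectors. A minimal sketch, assuming the native sector size is a multiple of 512; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Convert a count (or offset) expressed in native sectors of
 * 'sector_size' bytes into 512-byte sectors.
 */
uint64_t to_512b_sectors(uint64_t native_sectors, unsigned int sector_size)
{
    return native_sectors * (sector_size / 512);
}
```

For example, a partition that starts at sector 100 on a disk with 4096-byte sectors starts at 512-byte sector 800.
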
Hannes Reinecke [Tue, 16 Jul 2013 07:12:58 +0000 (09:12 +0200)]
Deprecate pg_timeout

pg_timeout has been removed from dm-multipath, so deprecate it
from userspace, too.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:03 +0000 (09:13 +0200)]
Minor fixes for priority handling

When no prio handler was selected we should be setting the
priority to 'PRIO_UNDEF'.
Also fixup a typo in the logging message when selecting
the default prioritizer.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:05 +0000 (09:13 +0200)]
Read directly from sysfs when checking the device size

Device sizes might change, so we need to be reading from sysfs
directly to avoid udev still caching the old value.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:04 +0000 (09:13 +0200)]
Check return value from pathinfo()

pathinfo() has a return value, which should be checked to catch
any abnormal behaviour.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:02 +0000 (09:13 +0200)]
multipath: Add 'Datacore Virtual Disk' to internal hardware table

Add 'Datacore Virtual Disk' to internal hardware table.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:00 +0000 (09:13 +0200)]
multipath: Deprecate 'getuid' configuration variable

Older versions of multipath-tools used the 'getuid_callout'
configuration variable to generate the WWID. So for compatibility
we should be accepting existing configurations, but mark the
variable as 'deprecated'.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:55 +0000 (09:12 +0200)]
multipath: Increase dev_loss_tmo prior to fast_io_fail

There are several constraints when setting fast_io_fail and
dev_loss_tmo.
dev_loss_tmo will be capped automatically when fast_io_fail is
not set. And fast_io_fail can not be raised beyond dev_loss_tmo.

So to increase dev_loss_tmo and fast_io_fail we first need
to increase dev_loss_tmo to the given fast_io_fail
setting, then set fast_io_fail, and then set dev_loss_tmo
to the final setting.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:59 +0000 (09:12 +0200)]
kpartx: create correct symlinks for PATH_FAILED events

The kernel is sending out PATH_FAILED/PATH_REINSTATED events,
upon each of which we need to regenerate the existing symlinks.
However, for an all-paths-down scenario we cannot read from
the disk, so we cannot execute blkid or kpartx.
For these cases we need to import the existing data from the
udev database.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:57 +0000 (09:12 +0200)]
libmultipath: Implement PATH_TIMEOUT

The tur checker might run into a timeout, e.g. when a command is
sent but the checker hasn't been able to receive a reply in time.
Use a specific 'PATH_TIMEOUT' state for these cases.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:54 +0000 (09:12 +0200)]
alua: Do not add preferred path priority for active/optimized

When a path is in active/optimized we should disregard the
'preferred path' bit when calculating the priority.
Otherwise we'll end up with two different priorities
(one for 'active/optimized (preferred)' and one for
'active/optimized (non-preferred)'),
which will result in two different path groups and
sub-optimal path usage.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:56 +0000 (09:12 +0200)]
libmultipath: return PATH_DOWN for quiesced paths

When a SCSI device is quiesced we cannot send any I/O requests to
it, so we should be returning 'PATH_DOWN' here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:53 +0000 (09:12 +0200)]
Document 'infinity' as possible value for dev_loss_tmo

infinity is a valid value for dev_loss_tmo, so we should be
documenting it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:52 +0000 (09:12 +0200)]
multipath: bind lifetime of udev context to main thread

We have to tie the lifetime of the udev context to the thread
or program. The current approach of creating it in config_load()
will invalidate the context during reconfiguration, thereby
causing all still existing objects to refer to an invalid pointer,
resulting in a nice crash.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Eli Qiao [Mon, 17 Jun 2013 03:53:22 +0000 (11:53 +0800)]
multipathd/cli_handlers cli_resize : check pp and pgp before calling them to avoid multipathd core dump.

Signed-off-by: Eli Qiao <taget@linux.vnet.ibm.com>
7 years agosimplify multipath signal handlers
Benjamin Marzinski [Wed, 8 May 2013 18:36:04 +0000 (13:36 -0500)]
simplify multipath signal handlers

This patch changes how multipath's signal handlers work.  Instead of
having sighup and sigusr1 acquire locks and call functions directly,
they now both simply set atomic variables.  These two signals are
blocked in child(), and all other threads inherit this sigmask. The
only place in all the multipath code that doesn't block these signals
is now the ppoll() call by the uxlsnr thread.  When it is interrupted
by a signal, the uxlsnr thread does the actual processing work.

Instead of sigend using mutex locks and condition variables to tell
the child() function to exit, it now uses a signal_handler safe
semaphore that child() waits on and exit_daemon() increments.

This patch also switches all the sigprocmask() calls to
pthread_sigmask().

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoAdd missing includes for remember_wwid
Benjamin Marzinski [Tue, 7 May 2013 22:12:37 +0000 (17:12 -0500)]
Add missing includes for remember_wwid

My previous commit (0245b3ac6e34dee1ab039bba71806bc35c286ab8) caused a
warning on compile, since it didn't include the wwids.h file in
main.c

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoCorrectly ignore empty prio names
Hannes Reinecke [Wed, 8 May 2013 09:13:43 +0000 (11:13 +0200)]
Correctly ignore empty prio names

This is a partial revert of commit
'Stop annoying prio_lookup warning messages',
as that patch would only fix the 'prio_put' case.
However, as the prio name might be empty even in
prio_get(), we should rather fix this in
prio_lookup() and handle both cases.

Signed-off-by: Hannes Reinecke <hare@suse.de>
7 years agoUse mapname to choose kpartx delimiter
Benjamin Marzinski [Thu, 2 May 2013 21:46:37 +0000 (16:46 -0500)]
Use mapname to choose kpartx delimiter

When kpartx creates partition devices, it uses the mapname as the base
for the partition device names. However when choosing the delimiter,
it uses the device name passed in.  So if kpartx is called on /dev/dm-X
it will always add a 'p', even if the mapname ends in a letter.  This
patch fixes that by setting the delimiter based on the mapname.
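
As an illustration of the rule, here is a small sketch that picks the delimiter from the map name alone. The function name is hypothetical and the code is not taken from kpartx; it assumes the convention implied above, namely that a 'p' is inserted only when the name ends in a digit.

```c
#include <ctype.h>
#include <string.h>

/*
 * Return "p" when the map name ends in a digit (so partition 1 of
 * "mpath1" becomes "mpath1p1"), and "" otherwise ("mpatha" -> "mpatha1").
 */
const char *partition_delim_sketch(const char *mapname)
{
    size_t len = strlen(mapname);

    if (len > 0 && isdigit((unsigned char)mapname[len - 1]))
        return "p";
    return "";
}
```

Passing a device name such as "dm-3" to this function always yields "p", which is the behaviour the patch avoids by using the map name instead.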

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agomake multipathd disable queue_without_daemon by default
Benjamin Marzinski [Thu, 2 May 2013 21:46:36 +0000 (16:46 -0500)]
make multipathd disable queue_without_daemon by default

Once multipathd stops, there is nothing to restore access to failed
paths. Any multipath devices with no active paths that are set to
queue_if_no_path will never stop queueing, even if they were supposed
to stop after a number of retries.  This can cause shutdown to hang.
So, this patch turns off queueing by default when multipathd is stopped.
It also adds two multipathd interactive commands "forcequeueing daemon"
and "restorequeueing daemon", which override and reset this behavior,
respectively.

Unfortunately this isn't a perfect solution.  Ideally, when restarting
multipathd, you would first call "forcequeueing daemon", to make sure
that any devices that are queueing without paths continue to do so while
you are restarting the daemon. However there is no way to do this in
systemd as there was in Upstart. There is a languishing RFE that I filed
for an ExecRestartPre option in systemd. But for most users, the only
time when they need to restart multipathd is when upgrading the package,
so forcing queueing can be dealt with there.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoStop annoying prio_lookup warning messages
Benjamin Marzinski [Thu, 2 May 2013 21:46:35 +0000 (16:46 -0500)]
Stop annoying prio_lookup warning messages

Multipath shouldn't try to look up its prioritizer if it doesn't have
one. Doing so just causes annoying warning messages.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoMake multipathd deal better with uninitialized paths
Benjamin Marzinski [Thu, 2 May 2013 21:46:34 +0000 (16:46 -0500)]
Make multipathd deal better with uninitialized paths

If multipathd cannot get all the necessary information from a path in
pathinfo, it clears the path's wwid, and adds it to the pathvec without
being initialized.  However, it never tries to reinitialize it later.
This can cause problems at bootup if multipathd is started at around
the same time as some path devices are discovered. multipathd may try
to initialize them in configure() before they are all the way set up.
After the paths are completely set up, multipathd will get a uevent for
them, but it won't try to reinitialize them. This patch adds
reinitialization code to uev_add_path().  Also, since getting the path
uid now just involves reading an attribute set by udev, there's no
reason not to try it for paths that are currently down.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoMake set_multipath_wwid actually do something
Benjamin Marzinski [Thu, 2 May 2013 21:46:33 +0000 (16:46 -0500)]
Make set_multipath_wwid actually do something

mpp->wwid is a character array in the multipath structure, not a pointer,
so it is never NULL. multipath needs to check if the string is empty
instead.
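
The difference can be shown with a small sketch. The structure below is a stand-in with a made-up name and size, not the real multipath structure.

```c
#include <string.h>

#define WWID_SIZE_SKETCH 128

/* Stand-in for the real structure: wwid is an embedded array. */
struct mp_sketch {
    char wwid[WWID_SIZE_SKETCH];
};

/*
 * A test like "if (!mpp->wwid)" can never be true for an embedded
 * array, so the emptiness check has to look at the first character.
 */
int wwid_is_empty_sketch(const struct mp_sketch *mpp)
{
    return mpp->wwid[0] == '\0';
}

/* Set the wwid only if none has been recorded yet. */
void set_wwid_sketch(struct mp_sketch *mpp, const char *wwid)
{
    if (!wwid_is_empty_sketch(mpp))
        return;
    strncpy(mpp->wwid, wwid, WWID_SIZE_SKETCH - 1);
    mpp->wwid[WWID_SIZE_SKETCH - 1] = '\0';
}
```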

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoFix max path checker timing
Benjamin Marzinski [Fri, 3 May 2013 17:59:49 +0000 (12:59 -0500)]
Fix max path checker timing

Due to some code being placed inside the wrong block, the number of
seconds to wait between path checks (pp->tick), was only getting set to
the path's individual check interval if that wasn't equal to the max
check interval.  Otherwise it was using the default for a failed path.
This patch makes sure that pp->tick always gets set correctly
for active paths.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoadd wwids file cleanup options
Benjamin Marzinski [Thu, 2 May 2013 21:46:31 +0000 (16:46 -0500)]
add wwids file cleanup options

This patch adds the "-w <device>" and "-W" options to multipath. They
allow users to either remove a specified device from the wwids file, or
reset the wwids file to only include the wwids for their current
multipath devices.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoAdd existing multipath devices to wwids file on
Benjamin Marzinski [Thu, 2 May 2013 21:46:30 +0000 (16:46 -0500)]
Add existing multipath devices to wwids file on

When multipathd started up, it didn't add any existing devices to the
wwids file.  Because of this, devices that were always set up in the
initramfs were not counted as valid multipath devices, and checking
if one of their paths was a multipath path device gave the incorrect
answer.  This patch makes multipath add those devices when it does
its initial configuration on startup.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoAvoid race between ueventloop and uevqloop
Benjamin Marzinski [Thu, 2 May 2013 21:46:29 +0000 (16:46 -0500)]
Avoid race between ueventloop and uevqloop

ueventloop sets up uevq_lockp and uev_condp, which uevqloop uses.
If uevqloop accesses these structures before ueventloop has
initialized them, it will not wake up to process uevents. This patch
statically initializes these structures so they will always be
initialized. Also, since calling LIST_HEAD(uevq) initializes it,
there is no reason to call INIT_LIST_HEAD on it later.
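
Static initialization of this kind can be sketched as below. The variable and function names are hypothetical and the counter stands in for the real event queue; the point is only that the mutex and condition variable are valid before any thread runs.

```c
#include <pthread.h>

/*
 * Initialized at load time, so whichever thread starts first
 * always sees valid objects.
 */
static pthread_mutex_t uevq_lock_sketch = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  uev_cond_sketch  = PTHREAD_COND_INITIALIZER;
static int pending_events_sketch;

/* Producer side: queue one event and wake the consumer. */
void post_event_sketch(void)
{
    pthread_mutex_lock(&uevq_lock_sketch);
    pending_events_sketch++;
    pthread_cond_signal(&uev_cond_sketch);
    pthread_mutex_unlock(&uevq_lock_sketch);
}

/*
 * Consumer side: wait until an event is queued, take it, and
 * return the number of events still pending.
 */
int take_event_sketch(void)
{
    int left;

    pthread_mutex_lock(&uevq_lock_sketch);
    while (pending_events_sketch == 0)
        pthread_cond_wait(&uev_cond_sketch, &uevq_lock_sketch);
    left = --pending_events_sketch;
    pthread_mutex_unlock(&uevq_lock_sketch);
    return left;
}
```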

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoFix some socket issues
Benjamin Marzinski [Thu, 2 May 2013 21:46:28 +0000 (16:46 -0500)]
Fix some socket issues

multipath wasn't actually using /org/kernel/linux/storage/multipathd
for its local socket because when it created and bound to that
socket, it didn't include the size of the structure in the length
it passed with the call.  The result was a truncated name. Also,
mpathpersist wasn't updated to use the new socket name. This patch
fixes both.
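
A sketch of how the address length affects the name, assuming an abstract-namespace UNIX socket (an assumption; the commit text does not say which kind of socket is used). For such sockets the length passed to bind() or connect() decides how many bytes of the name count, so it has to cover the address family field as well as the name. The function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Fill in an abstract-namespace address for 'name' and return the
 * length to pass to bind() or connect(), or 0 if the name is too long.
 * The length covers the family field, the leading NUL byte that marks
 * the abstract namespace, and the name itself. Leaving out the family
 * field would cut bytes off the end of the name.
 */
socklen_t abstract_addr_sketch(struct sockaddr_un *addr, const char *name)
{
    size_t n = strlen(name);

    if (n > sizeof(addr->sun_path) - 1)
        return 0;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path + 1, name, n);   /* sun_path[0] stays '\0' */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```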

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoFix hardware entry matching code
Benjamin Marzinski [Thu, 2 May 2013 21:46:27 +0000 (16:46 -0500)]
Fix hardware entry matching code

When a user defined hardware table entry's identifiers exactly
match a built-in one's, the built-in one is removed, and the list
is rescanned.  However, the built-in entry is not freed, and on the
rescan, the first user defined entry is treated as a built-in
entry. This patch frees the built-in entry, and decrements the
number of built-in entries, so that the rescan works as expected.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoDon't print checker messages for ghost paths
Benjamin Marzinski [Thu, 2 May 2013 21:46:25 +0000 (16:46 -0500)]
Don't print checker messages for ghost paths

Since PATH_GHOST is not an unexpected state, we don't need to
keep printing out checker messages for these paths.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoFix print_multipath_topology for large outputs
Benjamin Marzinski [Thu, 2 May 2013 21:46:24 +0000 (16:46 -0500)]
Fix print_multipath_topology for large outputs

print_multipath_topology had a hard size limit. With a large number
of LUNs and a large number of paths, it was possible to go over
this limit, and have some of the output cut off.
print_multipath_topology now checks for this, and resizes the
buffer if necessary.
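
The check-and-resize approach can be sketched as follows. Both functions are hypothetical; the formatter just emits one line per path and reports how many bytes the full output needs, in the style of snprintf().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical formatter: writes as many whole lines as fit into buf
 * and returns the number of bytes the complete output needs
 * (excluding the terminating NUL).
 */
static size_t render_sketch(char *buf, size_t size, int npaths)
{
    size_t need = 0;

    for (int i = 0; i < npaths; i++) {
        char line[32];
        size_t len = (size_t)snprintf(line, sizeof(line), "path %d\n", i);

        if (need + len < size)          /* room for the line plus NUL */
            memcpy(buf + need, line, len);
        need += len;
    }
    if (size > 0)
        buf[need < size ? need : size - 1] = '\0';
    return need;
}

/*
 * Render into a heap buffer, growing it whenever the output was cut
 * off. Returns NULL on allocation failure; the caller frees the result.
 */
char *render_topology_sketch(int npaths)
{
    size_t size = 64;                   /* deliberately small start */
    char *buf = malloc(size);

    while (buf != NULL) {
        size_t need = render_sketch(buf, size, npaths);
        char *tmp;

        if (need < size)
            break;                      /* everything fitted */
        size = need + 1;
        tmp = realloc(buf, size);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }
    return buf;
}
```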

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoMake kpartx correctly handle non-512 byte GPT
Benjamin Marzinski [Thu, 2 May 2013 21:46:23 +0000 (16:46 -0500)]
Make kpartx correctly handle non-512 byte GPT

The gpt code in kpartx correctly handled non-512 byte gpt
partitions right up until it was time to actually write out the
slice data. At that point it forgot to convert the logical block
address into the proper slice offset. This patch fixes that.
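
The conversion in question can be sketched as below, assuming the slice offsets are expressed in 512-byte sectors (the unit device-mapper tables use). The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Convert a GPT logical block address into an offset in 512-byte
 * sectors. logical_block_size is the device's logical block size
 * in bytes and is assumed to be a multiple of 512.
 */
uint64_t lba_to_sectors_sketch(uint64_t lba, uint32_t logical_block_size)
{
    return lba * (logical_block_size / 512);
}
```

With 512-byte blocks the two values are identical, which is why the missing conversion only showed up on devices with larger blocks: on a 4096-byte device, LBA 256 corresponds to sector 2048.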

Signed-off-by: Philipp Schmidt <philipp@ppc.in-berlin.de>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years agoMake kpartx advise modprobe instead of insmod
Benjamin Marzinski [Thu, 2 May 2013 21:46:22 +0000 (16:46 -0500)]
Make kpartx advise modprobe instead of insmod

Users usually want to use modprobe instead of insmod, since it
handles finding the correct version and dependencies.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
7 years ago[libmultipath] fix whitespace errors
Christophe Varoqui [Mon, 29 Apr 2013 20:45:03 +0000 (22:45 +0200)]
[libmultipath] fix whitespace errors

introduced by the 2 previous patches applied through git am.

7 years agoAdditional fixes for inconsistent quoting in snprint functions
Stewart, Sean [Tue, 23 Apr 2013 21:23:19 +0000 (21:23 +0000)]
Additional fixes for inconsistent quoting in snprint functions

This patch finishes the job from this commit: http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=cef43b6f910f740c0e2d38761f58c5ebedfb7585;hp=41b85341ca514a50d18c592996a2ecb43a81fa90
All attributes printing strings from their snprint functions should now be quoted.

Signed-off-by: Sean Stewart <Sean.Stewart@netapp.com>
7 years agoFix failback parameter parsing in conf
Stewart, Sean [Tue, 23 Apr 2013 21:21:18 +0000 (21:21 +0000)]
Fix failback parameter parsing in conf

This patch fixes a problem introduced in this commit: http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=cef43b6f910f740c0e2d38761f58c5ebedfb7585;hp=41b85341ca514a50d18c592996a2ecb43a81fa90
Currently, the string handler for failback on hw entries expects strings like "manual" to be quoted.  The buffer always strips quotes.
As a result, the keywords manual, immediate, and followover cannot be used to change a failback parameter through multipath.conf
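
A sketch of a handler that compares the keywords bare, since the value arrives with its quotes already stripped. The function name and the numeric codes are hypothetical and not taken from libmultipath.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative codes; negative values mark the keyword modes. */
enum {
    FAILBACK_MANUAL_SKETCH     = -1,
    FAILBACK_IMMEDIATE_SKETCH  = -2,
    FAILBACK_FOLLOWOVER_SKETCH = -3,
    FAILBACK_INVALID_SKETCH    = -100
};

/*
 * Parse a failback value whose quotes have already been removed.
 * Keywords are matched without quotes; anything else must be a
 * non-negative number of seconds.
 */
int parse_failback_sketch(const char *value)
{
    char *end;
    long secs;

    if (strcmp(value, "manual") == 0)
        return FAILBACK_MANUAL_SKETCH;
    if (strcmp(value, "immediate") == 0)
        return FAILBACK_IMMEDIATE_SKETCH;
    if (strcmp(value, "followover") == 0)
        return FAILBACK_FOLLOWOVER_SKETCH;

    errno = 0;
    secs = strtol(value, &end, 10);
    if (errno != 0 || end == value || *end != '\0' ||
        secs < 0 || secs > INT_MAX)
        return FAILBACK_INVALID_SKETCH;
    return (int)secs;
}
```

A handler that compared against the quoted form instead would reject every keyword, which matches the symptom described above.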

Signed-off-by: Sean Stewart <Sean.Stewart@netapp.com>
7 years agoDocs: multipath.8: A little quick copyediting
Michael Witten [Tue, 23 Apr 2013 20:21:11 +0000 (20:21 +0000)]
Docs: multipath.8: A little quick copyediting

The following patch applies cleanly at least to the following commit:

  de4e708f82c5f5f1575fafefbceb3624600c3dac

To apply this patch, save this email to:

  /path/to/email

and then run the following from the repository:

  $ git am --scissors /path/to/email

8<-----------8<-----------8<-----------8<-----------8<-----------8<-----------

Date: Mon, 22 Apr 2013 21:31:36 +0000

The grammar was arguably incorrect (or at least asymmetrical):

  multipath is used to detect... and coalesces them

Also, I found the description for the `device' parameter
to be a little rough.

Signed-off-by: Michael Witten <mfwitten@gmail.com>
7 years ago[libmultipath] Minimize noise with snapshots in emc_clariion checker
Jerome Levy [Sat, 27 Apr 2013 07:54:55 +0000 (09:54 +0200)]
[libmultipath] Minimize noise with snapshots in emc_clariion checker

Patch to stop emc_clariion_checker from logging messages when probes
to snapshot LUNs occur. Notification is still available if logging is
turned up (see condlog()) but normal probing of snapshots will no
longer produce status messages. Path functionality on snapshot probes
is unchanged.

Signed-off-by: Jerry Levy <jerome.levy@emc.com>
7 years agoDocs: multipath.conf.annotated: Document the `no_partitions' feature
Michael Witten [Thu, 25 Apr 2013 03:40:25 +0000 (03:40 +0000)]
Docs: multipath.conf.annotated: Document the `no_partitions' feature

The following commit added the `no_partitions' feature:

  095942eb14d735af80aa7d1d9fd8d3d53dc0db70
  Add 'no_partitions' feature
  Date: Wed Jul 23 14:15:12 2008 +0200

Signed-off-by: Michael Witten <mfwitten@gmail.com>
7 years agoDocs: multipath.conf.annotated: Update `path_selector' description
Michael Witten [Thu, 25 Apr 2013 03:18:53 +0000 (03:18 +0000)]
Docs: multipath.conf.annotated: Update `path_selector' description

In the following commit:

  c015b128103e7a6426d124a38cd679a181573b88
  multipath: change default path_selector to
  Date: Sat Jan 12 00:04:40 2013 -0600

the default for the `path_selector' attribute is changed
from "round-robin 0" to "service-time 0"; this new commit
reflects that change in `multipath.conf.annotated'.

Also, this commit adds documentation about the other
possible values for `path_selector'.

Signed-off-by: Michael Witten <mfwitten@gmail.com>
7 years agoDocs: multipath.conf.5: path_grouping_policy default is `failover', not `multibus'
Michael Witten [Thu, 25 Apr 2013 01:52:40 +0000 (01:52 +0000)]
Docs: multipath.conf.5: path_grouping_policy default is `failover', not `multibus'

$ git grep -n DEFAULT_PGPOLICY -- libmultipath/defaults.h
libmultipath/defaults.h:10:#define DEFAULT_PGPOLICY       FAILOVER

Signed-off-by: Michael Witten <mfwitten@gmail.com>