Hannes Reinecke [Fri, 13 Dec 2013 12:12:42 +0000 (13:12 +0100)]
multipath: do not call tur in sync mode if pthread_cancel fails
When pthread_cancel fails the thread is stuck, most likely
during I/O submission. So it would be pointless to call the
tur checker in sync mode here, as this would be stuck, too.
Hence we should rather return 'PATH_TIMEOUT' and hope the
situation resolves itself over time.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 12:12:41 +0000 (13:12 +0100)]
libmultipath: proactively remove path
When path_offline() detects a removed path we really do not need
to wait for any uevent to arrive, but can remove the path
straightaway.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 12:12:40 +0000 (13:12 +0100)]
multipath: do not print 'path is up' for removed paths
When a path is removed the previous checker message is still
kept in the checker context, and will be printed upon each
check. This causes multipath to print out
'path is up'
even though it already has been removed from sysfs.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 12:12:39 +0000 (13:12 +0100)]
libmultipath: Fix typo in retain_attached_hw_handler
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 12:12:38 +0000 (13:12 +0100)]
Document 'wwids_file' and 'reservation_key'
Add documentation for 'wwids_file' and 'reservation_key' to
multipath.conf.annotated.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 12:12:37 +0000 (13:12 +0100)]
Remove trailing spaces from sysfs attributes
Some sysfs attributes may contain trailing spaces, which only
serve to confuse matters. So strip them before continuing.
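A minimal sketch of the stripping step, not the code from this patch.
The helper name strip_trailing_spaces() is an assumption made for
illustration:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove trailing whitespace (spaces, tabs,
 * newlines) from an attribute value in place and return the new length. */
size_t strip_trailing_spaces(char *value)
{
	size_t len;

	assert(value != NULL);
	len = strlen(value);
	while (len > 0 && isspace((unsigned char)value[len - 1]))
		value[--len] = '\0';
	return len;
}
```

Leading whitespace is left alone on purpose, since the commit only
talks about trailing spaces.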
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 09:11:19 +0000 (10:11 +0100)]
multipathd: Implement systemd watchdog integration
In the past there have been several instances where multipathd
would hang with the checkerloop as some path checker might not
be able to return in time.
This patch now activates the watchdog feature from systemd
to shut down (and possibly restart) multipathd in these
situations.
Due to a bug in systemd, watchdog integration only works
correctly with later versions (> 206), so watchdog integration
has been disabled by default.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 09:11:18 +0000 (10:11 +0100)]
Use correct systemd unit directory
The systemd unit directory has been moved to /usr/lib/systemd.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 13 Dec 2013 09:11:17 +0000 (10:11 +0100)]
multipathd: enable core dumps for systemd
Add 'LimitCORE' definition to the service file to enable core
dumps when running under systemd.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:10 +0000 (00:43 -0600)]
kpartx: autoload loop module on loop partition creation
Currently kpartx doesn't do anything to force the loop module to
autoload, so creating partitions over files fails if the loop module
isn't already loaded. This patch makes kpartx try to find the next
available loop device by ioctling /dev/loop-control, which will
autoload the loop module.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:09 +0000 (00:43 -0600)]
multipathd: allow /dev/<devnode> to be used for multipathd commands
Multipathd expects that path and map names used in its interactive
commands are sysfs names with no directory. However, users often try
to use /dev names instead. This patch makes multipathd convert the /dev
names to sysfs names like the multipath command does.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:08 +0000 (00:43 -0600)]
libmultipath: blacklist blktap devices
Multipath can't run on top of blktap devices, so it should just ignore them.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:07 +0000 (00:43 -0600)]
kpartx: fix strict aliasing warning
Compiling with strict aliasing throws warnings about aliasing
volume_label_t to an array of unsigned ints. Adding __may_alias__
lets it know that we meant to do this.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:06 +0000 (00:43 -0600)]
multipathd: Don't touch the multipath device after setup_multipath fails
If setup_multipath fails, it removes the device. multipath always needs
to check its return value and not touch the device if setup_multipath
failed.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:05 +0000 (00:43 -0600)]
libmultipath: show default configurations
"multipathd show config" should show the current configuration, even if
the current values match the default values. Omitting the values when
they match the compiled in defaults just makes it harder for users to
see how multipath is configured.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:04 +0000 (00:43 -0600)]
kpartx: Make kpartx able to create read-only loop devices
This just passes the read-only value into set_loop, and falls back
to read-only mapping on EACCES to handle immutable files.
Signed-off-by: Till Maas <opensource@till.name>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:03 +0000 (00:43 -0600)]
libmultipath: try to deal with bindings file hand editing
Unfortunately, instead of adding aliases to /etc/multipath.conf like they
should, some users add aliases to the bindings file. If they add aliases
that look like user_friendly_names, like mpathfoo, they interfere with
the code to find the next available user_friendly_name (giving you
mpathfop as the next user_friendly_name in this case). This patch
will fix this by giving the next available name after an unbroken
string of names starting with the smallest possible name. In order to keep
the choice quick, it won't handle situations where the list of names
starting at the smallest possible is not in order. There are a number
of ways to do a better job finding the smallest possible name, but this
will fix most of the cases I've seen, without slowing things down at all.
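The selection rule can be sketched roughly as follows. This is a
simplified illustration and not the code from this patch: it assumes
the aliases have already been decoded to numeric ids (1 for the
smallest name, 2 for the next, and so on), and the function name
next_free_id() is made up.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* ids[] holds the decoded ids in the order they appear in the file.
 * A single pass keeps the choice quick. */
int next_free_id(const int *ids, size_t count)
{
	int next = 1;			/* end of the unbroken run so far */
	int biggest = 0;		/* largest id seen anywhere */
	int smallest_bigger = INT_MAX;	/* smallest id seen beyond the run */
	size_t i;

	assert(ids != NULL || count == 0);
	for (i = 0; i < count; i++) {
		int cur = ids[i];

		if (cur == next)
			next++;
		if (cur > biggest)
			biggest = cur;
		if (cur > next && cur < smallest_bigger)
			smallest_bigger = cur;
	}
	/* If the run may have hit an id that appeared out of order,
	 * fall back to a name past everything seen. */
	return next < smallest_bigger ? next : biggest + 1;
}
```

With ids 1, 2, 3 followed by a hand-added alias that decodes to a large
id, this picks 4 instead of the id after the large one. An out-of-order
file still gets an unused id, just not necessarily the smallest one.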
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:01 +0000 (00:43 -0600)]
fix multipath -W on empty wwids file
When multipath tries to open the wwids file and finds it empty or
missing, it writes the header to the file. When it tried to wipe
the wwids from an empty or missing file, it didn't seek back to the
start of the file after truncating it. This caused the wwids
file to have a patch of zeroed bytes at the start. This patch fixes
this by always seeking back to the start of the file before rewriting
the header.
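A minimal sketch of the truncate-then-rewind pattern described above.
The function names and the header text used by callers are assumptions
for illustration, not the code from this patch:

```c
#define _POSIX_C_SOURCE 200809L  /* for ftruncate() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open the file read/write, creating it if missing. */
int open_wwids_sketch(const char *path)
{
	assert(path != NULL);
	return open(path, O_RDWR | O_CREAT, 0600);
}

/* Truncate the file and write a fresh header at offset 0. */
int rewrite_header(int fd, const char *header)
{
	size_t len;

	assert(header != NULL);
	len = strlen(header);
	if (ftruncate(fd, 0) < 0)
		return -1;
	/* ftruncate() does not move the file offset, so rewind explicitly.
	 * Writing at the old offset would leave a run of zero bytes at
	 * the start of the file. */
	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	if (write(fd, header, len) != (ssize_t)len)
		return -1;
	return 0;
}
```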
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:43:00 +0000 (00:43 -0600)]
signal waiter thread to stop waiting on dm events
The ioctl syscall is not a pthread cancellation point. So, when
device-mapper is waiting on events, pthread_cancel won't cancel
the waiter thread. This patch makes the waiter threads handle
SIGUSR2, and has stop_waiter_thread() signal it to break out of
waiting on dm events. Unfortunately, there is still a possibility
of the signal arriving before the ioctl, and the waiter thread
still hanging until the devices are reloaded.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:59 +0000 (00:42 -0600)]
Only load a multipath device RO if RW load fails with EROFS
If a transient error keeps multipath from being able to reload a
multipath device read/write, it will retry the reload with the
device set to read-only. This can suddenly turn a read/write
multipath device read-only. If device-mapper cannot load the
device read/write because a path device can't be opened for writing,
it will return EROFS. This is the only time it makes sense to
reload the device read-only.
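The retry rule can be sketched like this. The callback type and the two
stand-in loaders are hypothetical and only exist to make the example
self-contained; the real reload goes through device-mapper:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical loader: returns 0 on success, -1 with errno set. */
typedef int (*load_fn)(int read_only);

/* Try read/write first; fall back to read-only only on EROFS. */
int reload_map(load_fn load)
{
	assert(load != NULL);
	if (load(0) == 0)
		return 0;
	if (errno != EROFS)
		return -1;	/* transient or unrelated error: keep RW */
	return load(1);
}

/* Stand-in: a path device cannot be opened for writing. */
int load_needs_ro(int read_only)
{
	if (!read_only) {
		errno = EROFS;
		return -1;
	}
	return 0;
}

/* Stand-in: some other failure, in either mode. */
int load_transient_failure(int read_only)
{
	(void)read_only;
	errno = EIO;
	return -1;
}
```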
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:58 +0000 (00:42 -0600)]
Don't detect prioritizers that don't work
The current method used by detect_prio was selecting the ALUA
prioritizer for devices that didn't have ALUA enabled. This patch
makes detect_prio go through all the steps to get a priority with the
ALUA prioritizer. If it is able to successfully get a priority, then
it selects the ALUA prioritizer for the device.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:57 +0000 (00:42 -0600)]
Turn off user_friendly_names for netapp devices
Netapp has requested that even if user friendly names is enabled in the
defaults section, they would like it disabled for their devices.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:56 +0000 (00:42 -0600)]
Fix multipath rename from user_friendly_name to wwid
When multipath was selecting an alias for a device on reload, if it
didn't have an explicit alias, and user_friendly_names wasn't set,
multipath would use the existing alias, if one existed. This made it
impossible to turn off user_friendly_names, and then reconfigure to
change the device names back to wwids.
Instead, multipath should just use the wwid as an alias, if that's
what it's configured to do, regardless of the existing name.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 11 Dec 2013 06:42:55 +0000 (00:42 -0600)]
Add multipath path format wildcard
This adds a new format wildcard, 'm', to be used with
multipathd show paths format
It prints the multipath device associated with the path.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Mateusz Półrola [Wed, 11 Dec 2013 05:43:57 +0000 (06:43 +0100)]
[multipathd] Declaration of envp variable in child function is missing
Hannes Reinecke [Tue, 26 Nov 2013 11:41:30 +0000 (12:41 +0100)]
multipathd: measure path check time
Instrument code to measure path check time.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:29 +0000 (12:41 +0100)]
multipathd: Read environment variables from systemd
When systemd adjusts 'OOMScoreAdjust' and 'LimitNOFILE'
we should take those settings and not try to adjust them
again on our side.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:26 +0000 (12:41 +0100)]
multipathd: use sd_notify() to inform systemd
Implement sd_notify() to inform systemd about our internal state.
And we should be using the service type 'notify' so that systemd
doesn't try to flood multipathd if it's still in discovery.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:25 +0000 (12:41 +0100)]
multipathd: switch to socket activation for systemd
multipathd already has a netlink socket for CLI commands, which
can be used for socket activation.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:24 +0000 (12:41 +0100)]
multipathd: Add option '-s' to suppress timestamps
systemd prefixes any messages to stdout with a timestamp, so it's
quite pointless to do it ourselves.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:23 +0000 (12:41 +0100)]
Use system-provided regex implementation
There is zero value in carrying our own (old) regex implementation
around; we're far better off using the system-provided one.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 26 Nov 2013 11:41:22 +0000 (12:41 +0100)]
Clarify uxsock logging
Socket creation might fail on various stages, so print out a
proper logging message.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Xinghai Yu [Thu, 28 Nov 2013 02:12:24 +0000 (10:12 +0800)]
move the 'update_multipath_strings()' function up so that mpp->mpe can be assigned in some situation
This patch fixes a bug in the handling of multipath's no_path_retry attribute.
When a multipath device has just been created by the user command 'multipath',
the DM driver doesn't know the wwid of this dm device yet, so the
'set_multipath_wwid()' function can't get the wwid for mpp at this time, and
mpp->mpe can't be found and assigned by wwid either. In that case, the
following function 'set_no_path_retry()' is not able to use the values provided
by mpp->mpe, which were read from the file '/etc/multipath.conf'. But the
function 'update_multipath_strings()' can assign the wwid to mpp from its path,
so move this function up to solve this problem.
Steps to reproduce the bug:
1. First, by setting 'no_path_retry fail', we have features='0'.
[root@localhost multipath-tools]# cat /etc/multipath.conf
...
multipath {
wwid 36000b5d0006a0000006a14e7000b0000
alias yellow
path_grouping_policy failover
no_path_retry fail
}
...
[root@localhost multipath-tools]# multipath -ll
yellow (36000b5d0006a0000006a14e7000b0000) dm-0
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=130 status=active
| `- 0:0:0:0 sda 8:0 active ready running
`-+- policy='service-time 0' prio=130 status=enabled
`- 1:0:0:0 sdc 8:32 active ready running
[root@localhost multipath-tools]#
2. Second, we flush all multipath device maps.
[root@localhost multipath-tools]# multipath -F
[root@localhost multipath-tools]#
3. Third, we create all multipath device maps.
[root@localhost multipath-tools]# multipath
create: yellow (36000b5d0006a0000006a14e7000b0000) undef
size=50G features='0' hwhandler='0' wp=undef
|-+- policy='service-time 0' prio=130 status=undef
| `- 0:0:0:0 sda 8:0 undef ready running
`-+- policy='service-time 0' prio=130 status=undef
`- 1:0:0:0 sdc 8:32 undef ready running
[root@localhost multipath-tools]#
4. Finally, we find "features='1 queue_if_no_path'", which is not what we
expect; this is the bug. After applying this patch, the features are set correctly.
[root@localhost multipath-tools]# multipath -ll
yellow (36000b5d0006a0000006a14e7000b0000) dm-0
size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=130 status=active
| `- 0:0:0:0 sda 8:0 active ready running
`-+- policy='service-time 0' prio=130 status=enabled
`- 1:0:0:0 sdc 8:32 active ready running
[root@localhost multipath-tools]#
Signed-off-by: Xinghai Yu <yuxinghai@cn.fujitsu.com>
Hannes Reinecke [Fri, 15 Nov 2013 10:29:36 +0000 (11:29 +0100)]
libmultipath: do not stall on recv_packet()
The CLI socket might have been closed or the daemon might have
been terminated by systemd without closing the CLI socket.
Hence we need to poll the socket to check whether further data is available,
otherwise the read() call will hang.
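A minimal sketch of polling before reading, so that a silent peer leads
to a timeout instead of a hang. The function name, the timeout
parameter and the choice of -ETIMEDOUT are assumptions for
illustration, not taken from this patch:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to timeout_ms for data, then read once.
 * Returns the byte count (0 means the peer closed the connection),
 * -ETIMEDOUT if nothing arrived in time, or another negative errno. */
ssize_t read_with_poll(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	ssize_t n;
	int ready;

	assert(buf != NULL);
	ready = poll(&pfd, 1, timeout_ms);
	if (ready < 0)
		return -errno;
	if (ready == 0)
		return -ETIMEDOUT;
	/* POLLIN or POLLHUP is set, so read() will not block here. */
	n = read(fd, buf, len);
	return n < 0 ? -errno : n;
}
```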
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 15 Nov 2013 10:29:35 +0000 (11:29 +0100)]
libmultipath: return error numbers from sysfs_get_XXX
Instead of returning hand-crafted error values from sysfs_get_XXX
functions we should be using the standard error numbers.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 15 Nov 2013 10:29:34 +0000 (11:29 +0100)]
libmultipath: fixup strlcpy
The final comparison was wrong; 'size' was never decreased.
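For reference, a strlcpy-style copy can be sketched as below. This is an
independent illustration of the usual semantics (always terminate when
size > 0, return the length of the source), not the libmultipath
implementation. The name sketch_strlcpy() is chosen to avoid clashing
with a system-provided strlcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t copied = 0;

	assert(dst != NULL && src != NULL);
	/* Track the progress in 'copied' and always leave room for the
	 * terminator; 'size' itself is never modified. */
	while (copied + 1 < size && src[copied] != '\0') {
		dst[copied] = src[copied];
		copied++;
	}
	if (size > 0)
		dst[copied] = '\0';
	/* A return value >= size tells the caller the copy was truncated. */
	return strlen(src);
}
```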
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 15 Nov 2013 10:29:33 +0000 (11:29 +0100)]
Set priority to '0' for PATH_BLOCKED or PATH_DOWN
When a path is down or blocked we need to initialize the priority
to '0'. Otherwise multipathd will discard the maps during reload
and fail to start if all paths are temporarily down.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Fri, 15 Nov 2013 10:29:32 +0000 (11:29 +0100)]
Improve logging for orphan_path()
orphan_path() is called from various sections, so add a
description to the call to aid debugging.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Sean Stewart [Thu, 14 Nov 2013 00:09:29 +0000 (18:09 -0600)]
libmultipath/checkers/rdac.c: Use RTPG data in RDAC checker
Make the RDAC checker utilize the RTPG data given in the 0xC9 inquiry
page to make intelligent decisions about path status, and print more
descriptive path down messages.
Signed-off-by: Sean Stewart <sean.stewart@netapp.com>
Sean Stewart [Thu, 14 Nov 2013 00:06:34 +0000 (18:06 -0600)]
libmultipath/discovery.c: Fix a condition that will lead to a table reload during every check.
This happens if path_offline says PATH_UP and the checker says PATH_DOWN. In
this case, pathinfo gets called twice, and the first call, from
update_prio, sets the priority to -1 because the checker says the path is down.
On the second pathinfo call, from update_path_groups, it will call the
prioritizer based on the fact that the prio is -1. This leads to a
flip-flop of the priority value and a reload on every check.
Signed-off-by: Sean Stewart <sean.stewart@netapp.com>
Mike Christie [Fri, 25 Oct 2013 00:50:55 +0000 (19:50 -0500)]
multipathd: revert mpp size update if map update fails
If updating the dm device in the kernel fails we cannot leave
the mpp size updated, because if we correct the problem and
try to resize later multipathd prevents resizing the device
if the size has not changed.
I hit this when all paths to a dm-multipath device were not
yet updated but multipathd resize was run.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Zheng Huai Cheng [Mon, 14 Oct 2013 19:44:25 +0000 (21:44 +0200)]
multipath: Allow whitespace before or after exit and quit command in multipathd
This problem was reported by an IBM internal test team.
In multipathd interactive mode, commands allow leading and trailing
whitespace, while quit and exit do not. If there is whitespace before or
after the quit|exit command, multipathd won't recognize it and will not
exit the interactive mode.
This patch adds support for whitespace around the quit and exit commands.
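One way to do the comparison is sketched below. The function name
is_quit_command() is hypothetical and this is not the code from this
patch:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the line is "quit" or "exit", ignoring surrounding
 * whitespace, and 0 otherwise. */
int is_quit_command(const char *line)
{
	const char *start, *end;
	size_t len;

	assert(line != NULL);
	start = line;
	while (isspace((unsigned char)*start))
		start++;
	end = start + strlen(start);
	while (end > start && isspace((unsigned char)end[-1]))
		end--;
	len = (size_t)(end - start);
	return len == 4 && (strncmp(start, "quit", 4) == 0 ||
			    strncmp(start, "exit", 4) == 0);
}
```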
Signed-off-by: Zheng Huai Cheng <zhenghch@linux.vnet.ibm.com>
Christophe Varoqui [Wed, 17 Jul 2013 20:19:38 +0000 (22:19 +0200)]
Remove again multipath_dir from multipath.conf.defaults
Suggested, again, by Xose Vasquez Perez <xose.vazquez@gmail.com>
Christophe Varoqui [Wed, 17 Jul 2013 20:16:35 +0000 (22:16 +0200)]
Sync the Netapp INF-01-00 defaults to multipath.conf.defaults
Suggested by Xose Vasquez Perez <xose.vazquez@gmail.com>
Christophe Varoqui [Tue, 16 Jul 2013 20:31:40 +0000 (22:31 +0200)]
Build fixes for mpath_persist and multipath
libudev must be linked there too.
Hannes Reinecke [Tue, 16 Jul 2013 07:13:09 +0000 (09:13 +0200)]
Update multipath.conf.defaults
Signed-off-by: Hannes Reinecke <hare@suse.de>
Petr Uzel [Tue, 16 Jul 2013 07:13:19 +0000 (09:13 +0200)]
multipath: fix setting of fast_io_fail_tmo
Setting fast_io_fail_tmo to the same value as dev_loss_tmo is
not allowed by the kernel. Increase dev_loss_tmo by 1 in such
cases to make it strictly greater than fast_io_fail_tmo.
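The adjustment amounts to something like the following sketch. The
function name and the use of plain unsigned integers are assumptions;
special values such as 'off' or 'infinity' are not modelled here:

```c
#include <assert.h>
#include <stddef.h>

/* Bump dev_loss_tmo by one if it equals fast_io_fail_tmo, so that
 * dev_loss_tmo ends up strictly greater. */
void fix_dev_loss_tmo(unsigned int fast_io_fail_tmo,
		      unsigned int *dev_loss_tmo)
{
	assert(dev_loss_tmo != NULL);
	if (*dev_loss_tmo == fast_io_fail_tmo)
		*dev_loss_tmo += 1;
}
```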
Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:16 +0000 (09:13 +0200)]
multipathd: valgrind fixes
valgrind complained about uninitialized memory.
As usual, valgrind was right, although the memory never was
actually referenced.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:17 +0000 (09:13 +0200)]
multipathd: increase stacksize for uevent listener
libudev has quite some heavy stack usage, so the default stack size
is not enough here.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:15 +0000 (09:13 +0200)]
multipath: reference the udev context when starting event queue
The uevent listener is running asynchronously, so it might still
be active and receiving events when the main thread is already
shut down. So it needs to take a separate reference to the udev
context to avoid the context becoming invalid while the listener
is running.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:18 +0000 (09:13 +0200)]
Specify checker_timeout in seconds
Commit 8d3f07da changed the internal value for checker_timeout
to be in milliseconds, which wasn't reflected in the tur checker.
So better scale it back to seconds, and change the callers to
scale it to milliseconds where appropriate.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:14 +0000 (09:13 +0200)]
Do not print error when rport is blocked
When an rport is blocked any write to the dev_loss_tmo
attribute will fail with EBUSY. But that's perfectly
normal and nothing to worry about, so decrease the
logging priority for these cases.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:20 +0000 (09:13 +0200)]
multipath: reset queue_if_no_path if flush failed
When flushing a map failed the 'queue_if_no_path' setting is
getting lost.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:12 +0000 (09:13 +0200)]
multipath.conf.annotated: remove 'udev_dir'
Deprecated, remove it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:13 +0000 (09:13 +0200)]
multipath: Implement 'property' blacklist
Multipath can only properly handle devices which support the VPD
page 0x83. Originally this was ensured by 'scsi_id', which would
not present an ID_SERIAL value in these cases.
With the move to udev 'ID_SERIAL' is now always present, so
multipath would try to attach to _all_ SCSI devices.
This patch implements a 'property' blacklist, which allows
blacklisting a device based on the existence of udev properties.
Any device not providing the udev property from the whitelist
will be ignored.
The default whitelist is set to '(ID_WWN|ID_SCSI_VPD)'.
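The check can be sketched with the POSIX regex API. This is an
illustration only: the function name is made up, and the property names
are passed in as a plain array rather than read from libudev.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if any property name matches the whitelist pattern,
 * 0 if none does (the device would be ignored), or -1 if the
 * pattern does not compile. */
int has_whitelisted_property(const char *const *names, size_t count,
			     const char *pattern)
{
	regex_t re;
	size_t i;
	int found = 0;

	assert(pattern != NULL);
	assert(names != NULL || count == 0);
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;
	for (i = 0; i < count && !found; i++) {
		if (regexec(&re, names[i], 0, NULL, 0) == 0)
			found = 1;
	}
	regfree(&re);
	return found;
}
```

Note that the pattern is not anchored in this sketch, so a property
whose name merely contains ID_WWN also counts as a match.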
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:21 +0000 (09:13 +0200)]
libmultipath: read path state directly from sysfs
The 'state' attribute of a SCSI device might change without
generating any udev events, so we need to read it directly
from sysfs.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:06 +0000 (09:13 +0200)]
multipath.conf.annotated: Document rr_min_io_rq
rr_min_io_rq wasn't documented in multipath.conf.annotated.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:10 +0000 (09:13 +0200)]
Correctly set pgfailback
Something weird happened to pgfailback; no default was assigned
when loading the configuration, but then it got set (wrongly)
to the default value when printing the configuration.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:08 +0000 (09:13 +0200)]
Correctly set max_fds in case of failure
If we fail to get the system limit for max_fds we should assume
a safe value of 4096.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:07 +0000 (09:13 +0200)]
Correctly print out 'max' for max_fds
If the value specified for 'max_fds' is the system-wide limit
we should be printing out 'max', and not that value.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:11 +0000 (09:13 +0200)]
multipath.conf.5: clarify 'no_path_retry' default setting
The default setting for 'no_path_retry' is 'unset', so we should
be stating this in the man page.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Petr Uzel [Tue, 16 Jul 2013 07:13:01 +0000 (09:13 +0200)]
kpartx: support disk with non-512B sectors
libdevmapper expects sector size to be recalculated to 512B, so we need
to teach kpartx to do so if the underlying DM device has different
sector size (for GPT and msdos partition tables).
Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:58 +0000 (09:12 +0200)]
Deprecate pg_timeout
pg_timeout has been removed from dm-multipath, so deprecate it
from userspace, too.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:03 +0000 (09:13 +0200)]
Minor fixes for priority handling
When no prio handler was selected we should be setting the
priority to 'PRIO_UNDEF'.
Also fixup a typo in the logging message when selecting
the default prioritizer.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:05 +0000 (09:13 +0200)]
Read directly from sysfs when checking the device size
Device sizes might change, so we need to be reading from sysfs
directly to avoid udev still caching the old value.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:04 +0000 (09:13 +0200)]
Check return value from pathinfo()
pathinfo() has a return value, which should be checked to catch
any abnormal behaviour.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:02 +0000 (09:13 +0200)]
multipath: Add 'Datacore Virtual Disk' to internal hardware table
Add 'Datacore Virtual Disk' to internal hardware table.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:13:00 +0000 (09:13 +0200)]
multipath: Deprecate 'getuid' configuration variable
Older versions of multipath-tools used the 'getuid_callout'
configuration variable to generate the WWID. So for compatibility
we should be accepting existing configurations, but mark the
variable as 'deprecated'.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:55 +0000 (09:12 +0200)]
multipath: Increase dev_loss_tmo prior to fast_io_fail
There are several constraints when setting fast_io_fail and
dev_loss_tmo.
dev_loss_tmo will be capped automatically when fast_io_fail is
not set. And fast_io_fail can not be raised beyond dev_loss_tmo.
So to increase dev_loss_tmo and fast_io_fail we first need
to increase dev_loss_tmo to the given fast_io_fail
setting, then set fast_io_fail, and then set dev_loss_tmo
to the final setting.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:59 +0000 (09:12 +0200)]
kpartx: create correct symlinks for PATH_FAILED events
The kernel is sending out PATH_FAILED/PATH_REINSTATED events,
upon each of which we need to regenerate the existing symlinks.
However, for an all-paths-down scenario we cannot read from
the disk, so we cannot execute blkid or kpartx.
For these cases we need to import the existing data from the
udev database.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:57 +0000 (09:12 +0200)]
libmultipath: Implement PATH_TIMEOUT
The tur checker might run into a timeout, e.g. when a command is
sent but the checker hasn't been able to receive a reply in time.
Use a specific 'PATH_TIMEOUT' state for these cases.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:54 +0000 (09:12 +0200)]
alua: Do not add preferred path priority for active/optimized
When a path is in active/optimized we should disregard the
'preferred path' bit when calculating the priority.
Otherwise we'll end up with having two different priorities
(one for 'active/optimized (preferred)' and one for
'active/optimized (non-preferred)').
Which will result in two different path groups and a
sub-optimal path usage.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:56 +0000 (09:12 +0200)]
libmultipath: return PATH_DOWN for quiesced paths
When a SCSI device is quiesced we cannot send any I/O requests to
it, so we should be returning 'PATH_DOWN' here.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:53 +0000 (09:12 +0200)]
Document 'infinity' as possible value for dev_loss_tmo
infinity is a valid value for dev_loss_tmo, so we should be
documenting it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 16 Jul 2013 07:12:52 +0000 (09:12 +0200)]
multipath: bind lifetime of udev context to main thread
We have to tie the lifetime of the udev context to the thread
or program. The current approach by creating it on config_load()
will invalidate the context during reconfiguration, thereby
causing all still existent objects to refer to an invalid pointer,
resulting in a nice crash.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Eli Qiao [Mon, 17 Jun 2013 03:53:22 +0000 (11:53 +0800)]
multipathd/cli_handlers cli_resize : check pp and pgp before calling them to avoid multipathd core dump.
Signed-off-by: Eli Qiao <taget@linux.vnet.ibm.com>
Benjamin Marzinski [Wed, 8 May 2013 18:36:04 +0000 (13:36 -0500)]
simplify multipath signal handlers
This patch changes how multipath's signal handlers work. Instead of
having sighup and sigusr1 acquire locks and call functions directly,
they now both simply set atomic variables. These two signals are
blocked in child(), and all other threads inherit this sigmask. The
only place in all the multipath code that doesn't block these signals
is now the ppoll() call by the uxlsnr thread. When it is interrupted
by a signal, the uxlsnr thread does the actual processing work.
Instead of sigend using mutex locks and condition variables to tell
the child() function to exit, it now uses a signal_handler safe
semaphore that child() waits on and exit_daemon() increments.
This patch also switches all the sigprocmask() calls to
pthread_sigmask()
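The flag-setting pattern can be sketched as follows. The flag and
function names are hypothetical, and the sketch leaves out the signal
masking and the ppoll() loop described above:

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* The handlers only record that a signal arrived. */
volatile sig_atomic_t sighup_pending;
volatile sig_atomic_t sigusr1_pending;

void on_sighup(int sig)  { (void)sig; sighup_pending = 1; }
void on_sigusr1(int sig) { (void)sig; sigusr1_pending = 1; }

int install_handlers(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_handler = on_sighup;
	if (sigaction(SIGHUP, &sa, NULL) < 0)
		return -1;
	sa.sa_handler = on_sigusr1;
	return sigaction(SIGUSR1, &sa, NULL);
}

/* Meant to run in the one thread that unblocks the signals: report
 * which requests are pending and clear them. The real work (taking
 * locks, reconfiguring) would happen here, outside signal context. */
void drain_pending(int *hup, int *usr1)
{
	assert(hup != NULL && usr1 != NULL);
	*hup = sighup_pending;
	sighup_pending = 0;
	*usr1 = sigusr1_pending;
	sigusr1_pending = 0;
}
```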
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 7 May 2013 22:12:37 +0000 (17:12 -0500)]
Add missing includes for remember_wwid
My previous commit (0245b3ac6e34dee1ab039bba71806bc35c286ab8) caused a
warning on compile, since it didn't include the wwids.h file in
main.c
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Hannes Reinecke [Wed, 8 May 2013 09:13:43 +0000 (11:13 +0200)]
Correctly ignore empty prio names
This is a partial revert of commit
'Stop annoying prio_lookup warning messages',
as that patch would only fix the 'prio_put' case.
However, as the prio name might be empty even in
prio_get() we should rather fix this in
prio_lookup() and handle both cases.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Benjamin Marzinski [Thu, 2 May 2013 21:46:37 +0000 (16:46 -0500)]
Use mapname to choose kpartx delimiter
When kpartx creates partition devices, it uses the mapname as the base
for the partition device names. However when choosing the delimiter,
it uses the device name passed in. So if kpartx is called on /dev/dm-X
it will always add a 'p', even if the mapname ends in a letter. This
patch fixes that by setting the delimiter based on the mapname.
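A rough sketch of the delimiter choice, assuming the rule implied above:
a 'p' is inserted only when the map name ends in a digit. The function
name is hypothetical and this is not the kpartx code itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Pick the delimiter from the map name, not from the device node. */
const char *choose_delimiter(const char *mapname)
{
	size_t len;

	assert(mapname != NULL);
	len = strlen(mapname);
	if (len > 0 && isdigit((unsigned char)mapname[len - 1]))
		return "p";
	return "";
}
```

Under this assumption, a map called mpatha gets partitions named
mpatha1, mpatha2, and so on, even when kpartx is invoked on its
/dev/dm-X node.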
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:36 +0000 (16:46 -0500)]
make multipathd disable queue_without_daemon by default
Once multipathd stops, there is nothing to restore access to failed
paths. Any multipath devices with no active paths that are set to
queue_if_no_path will never stop queueing, even if they were supposed
to stop after a number of retries. This can cause shutdown to hang.
So, this patch turns off queueing by default when multipathd is stopped.
It also adds two multipathd interactive commands "forcequeueing daemon"
and "restorequeueing daemon", which override and reset this behavior,
respectively.
Unfortunately this isn't a perfect solution. Ideally, when restarting
multipathd, you would first call "forcequeueing daemon", to make sure
that any devices that are queueing without paths continue to do so while
you are restarting the daemon. However there is no way to do this in
systemd as there was in Upstart. There is a languishing RFE that I filed
for an ExecRestartPre option in systemd. But for most users, the only
time when they need to restart multipathd is when upgrading the package,
so forcing queueing can be dealt with there.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:35 +0000 (16:46 -0500)]
Stop annoying prio_lookup warning messages
Multipath shouldn't try to look up its prioritizer if it doesn't have
one. Doing so just causes annoying warning messages.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:34 +0000 (16:46 -0500)]
Make multipathd deal better with uninitialized paths
If multipathd cannot get all the necessary information from a path in
pathinfo, it clears the path's wwid, and adds it to the pathvec without
being initialized. However, it never tries to reinitialize it later.
This can cause problems at bootup if multipathd is started at around
the same time as some path devices are discovered. multipathd may try
to initialize them in configure() before they are all the way set up.
After the paths are completely set up, multipathd will get a uevent for
them, but it won't try to reinitialize them. This patch adds
reinitialization code to uev_add_path(). Also, since getting the path
uid now just involves reading an attribute set by udev, there's no
reason not to try it for paths that are currently down.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:33 +0000 (16:46 -0500)]
Make set_multipath_wwid actually do something
mpp->wwid is a character array in the multipath structure, not a pointer,
so it is never NULL. multipath needs to check if the string is empty
instead.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
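A minimal sketch of the distinction, assuming a simplified structure: struct multipath_sketch, WWID_SIZE as used here, and wwid_is_unset() are illustrative names, not the real libmultipath definitions.

```c
#include <string.h>

#define WWID_SIZE 128   /* assumed size, for this sketch only */

/* Simplified stand-in for the real multipath structure. */
struct multipath_sketch {
    char wwid[WWID_SIZE];   /* embedded array, not a pointer */
};

/* Returns 1 if no WWID has been stored yet, 0 otherwise. */
int wwid_is_unset(const struct multipath_sketch *mpp)
{
    /* A test such as `mpp->wwid == NULL` can never be true for an
     * embedded array, so the emptiness of the string is checked instead. */
    return mpp->wwid[0] == '\0';
}
```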
Benjamin Marzinski [Fri, 3 May 2013 17:59:49 +0000 (12:59 -0500)]
Fix max path checker timing
Due to some code being placed inside the wrong block, the number of
seconds to wait between path checks (pp->tick), was only getting set to
the path's individual check interval if that wasn't equal to the max
check interval. Otherwise it was using the default for a failed path.
This patch makes sure that pp->tick always gets set correctly
for active paths.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:31 +0000 (16:46 -0500)]
add wwids file cleanup options
This patch adds the "-w <device>" and "-W" options to multipath. They
allow users to either remove a specified device from the wwids file, or
reset the wwids file to only include the wwids for their current
multipath devices.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:30 +0000 (16:46 -0500)]
Add existing multipath devices to wwids file on
When multipathd started up, it didn't add any existing devices to the
wwids file. Because of this, devices that were always set up in the
initramfs were not counted as valid multipath devices, and checking
if one of their paths was a multipath path device gave the incorrect
answer. This patch makes multipath add those devices when it does
its initial configuration on startup.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:29 +0000 (16:46 -0500)]
Avoid race between ueventloop and uevqloop
ueventloop sets up uevq_lockp and uev_condp, which uevqloop uses.
If uevqloop accesses these structures before ueventloop has
initialized them, it will not wake up to process uevents. This patch
statically initializes these structures so they will always be
initialized. Also, since calling LIST_HEAD(uevq) initializes it,
there is no reason to call INIT_LIST_HEAD on it later.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
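Static initialization of this kind can be sketched with the standard pthread initializers. The names below (uevq_lock, uev_cond, pending_events, post_event, take_event) are placeholders for this example and do not mirror the real uevent code.

```c
#include <pthread.h>

/* Statically initialized, so both objects are valid before any thread
 * runs, regardless of which thread starts first. */
static pthread_mutex_t uevq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  uev_cond  = PTHREAD_COND_INITIALIZER;
static int pending_events;   /* stand-in for the real uevent queue */

/* Producer side: record one event and wake a waiter. */
void post_event(void)
{
    pthread_mutex_lock(&uevq_lock);
    pending_events++;
    pthread_cond_signal(&uev_cond);
    pthread_mutex_unlock(&uevq_lock);
}

/* Consumer side: wait for an event, consume it, and return how many
 * events are still pending afterwards. */
int take_event(void)
{
    int remaining;

    pthread_mutex_lock(&uevq_lock);
    while (pending_events == 0)
        pthread_cond_wait(&uev_cond, &uevq_lock);
    pending_events--;
    remaining = pending_events;
    pthread_mutex_unlock(&uevq_lock);
    return remaining;
}
```

With run-time initialization, a consumer that starts first may lock or wait on an object that has not been set up yet. The static initializers remove that ordering dependency.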
Benjamin Marzinski [Thu, 2 May 2013 21:46:28 +0000 (16:46 -0500)]
Fix some socket issues
multipath wasn't actually using /org/kernel/linux/storage/multipathd
for its local socket because when it created and bound to that
socket, it didn't include the size of the structure in the length
it passed with the call. The result was a truncated name. Also,
mpathpersist wasn't updated to use the new socket name. This patch
fixes both.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
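The length calculation can be sketched like this, assuming the socket lives in the Linux abstract namespace (leading NUL byte in sun_path). abstract_addr() is a hypothetical helper written for this example, not the function used by multipathd or mpathpersist.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill `addr` for an abstract-namespace socket
 * called `name` and return the length to pass to bind() or connect().
 * Returns 0 if the name does not fit. */
socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t n = strlen(name);

    if (n + 1 > sizeof(addr->sun_path))
        return 0;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays '\0', which marks the abstract namespace. */
    memcpy(addr->sun_path + 1, name, n);

    /* The length must cover the structure header as well as the name.
     * Passing only the name length makes the kernel see a shortened name. */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```

Both ends have to compute the length the same way, otherwise the client and the daemon end up using different socket names.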
Benjamin Marzinski [Thu, 2 May 2013 21:46:27 +0000 (16:46 -0500)]
Fix hardware entry matching code
When a user defined hardware table entry's identifiers exactly
match a built-in one's, the built-in one is removed, and the list
is rescanned. However, the built-in entry is not freed, and on the
rescan, the first user defined entry is treated as a built-in
entry. This patch frees the built-in entry, and decrements the
number of built-in entries, so that the rescan works as expected.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:25 +0000 (16:46 -0500)]
Don't print checker messages for ghost paths
Since PATH_GHOST is not an unexpected state, we don't need to
keep printing out checker messages for these paths.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Thu, 2 May 2013 21:46:24 +0000 (16:46 -0500)]
Fix print_multipath_topology for large outputs
print_multipath_topology had a hard size limit. With a large number
of LUNs and a large number of paths, it was possible to go over
this limit, and have some of the output cut off.
print_multipath_topology now checks for this, and resizes the
buffer if necessary.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
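One common way to detect truncation and grow the buffer is sketched below. format_lines() and its "path N" output are made up for this example and this is not the actual print_multipath_topology code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative only: format `count` lines into a heap buffer, doubling
 * the buffer and starting over whenever the output does not fit.
 * Returns a malloc'd string the caller must free, or NULL on error. */
char *format_lines(int count)
{
    size_t cap = 16;   /* deliberately small starting size */
    char *buf = malloc(cap);

    if (!buf)
        return NULL;

    for (;;) {
        size_t used = 0;
        int truncated = 0;

        buf[0] = '\0';
        for (int i = 0; i < count; i++) {
            int n = snprintf(buf + used, cap - used, "path %d\n", i);

            if (n < 0) {
                free(buf);
                return NULL;
            }
            /* snprintf reports the length it wanted to write; a value
             * that does not fit in the remaining space means the
             * output was cut off. */
            if ((size_t)n >= cap - used) {
                truncated = 1;
                break;
            }
            used += (size_t)n;
        }
        if (!truncated)
            return buf;

        cap *= 2;
        char *tmp = realloc(buf, cap);
        if (!tmp) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }
}
```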
Benjamin Marzinski [Thu, 2 May 2013 21:46:23 +0000 (16:46 -0500)]
Make kpartx correctly handle non-512 byte GPT
The gpt code in kpartx correctly handled non-512 byte gpt
partitions right up until it was time to actually write out the
slice data. At that point it forgot to convert the logical block
address into the proper slice offset. This patch fixes that.
Signed-off-by: Philipp Schmidt <philipp@ppc.in-berlin.de>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
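The missing conversion amounts to a unit change, sketched here under the assumption that slice offsets are counted in 512-byte sectors and that the logical block size is a multiple of 512. lba_to_sectors() is an illustrative name, not a kpartx function.

```c
#include <stdint.h>

/* Illustrative only: convert a GPT logical block address, counted in
 * blocks of `block_size` bytes, into an offset in 512-byte sectors. */
uint64_t lba_to_sectors(uint64_t lba, unsigned int block_size)
{
    /* For 512-byte blocks the factor is 1, so the value is unchanged.
     * For 4096-byte blocks every LBA spans 8 sectors. */
    return lba * (block_size / 512);
}
```

On a disk with 4096-byte blocks, a partition starting at LBA 6 would begin at sector 48 in these units.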
Benjamin Marzinski [Thu, 2 May 2013 21:46:22 +0000 (16:46 -0500)]
Make kpartx advise modprobe instead of insmod
Users usually want to use modprobe instead of insmod, since it
handles finding the correct version and dependencies.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Christophe Varoqui [Mon, 29 Apr 2013 20:45:03 +0000 (22:45 +0200)]
[libmultipath] fix whitespace errors
introduced by the 2 previous patches applied through git am.
Stewart, Sean [Tue, 23 Apr 2013 21:23:19 +0000 (21:23 +0000)]
Additional fixes for inconsistent quoting in snprint functions
This patch finishes the job from this commit: http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=cef43b6f910f740c0e2d38761f58c5ebedfb7585;hp=41b85341ca514a50d18c592996a2ecb43a81fa90
All attributes printing strings from their snprint functions should now be quoted.
Signed-off-by: Sean Stewart <Sean.Stewart@netapp.com>
Stewart, Sean [Tue, 23 Apr 2013 21:21:18 +0000 (21:21 +0000)]
Fix failback parameter parsing in conf
This patch fixes a problem introduced in this commit: http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=cef43b6f910f740c0e2d38761f58c5ebedfb7585;hp=41b85341ca514a50d18c592996a2ecb43a81fa90
Currently, the string handler for failback on hw entries expects strings like "manual" to be quoted. The buffer always strips quotes.
As a result, the keywords manual, immediate, and followover cannot be used to change a failback parameter through multipath.conf
Signed-off-by: Sean Stewart <Sean.Stewart@netapp.com>
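A rough sketch of keyword matching on a value whose quotes have already been stripped. The enum values and parse_failback() are hypothetical and do not reproduce the real handler. Numeric values are left out of this example.

```c
#include <string.h>

/* Hypothetical constants for this sketch, not libmultipath's own. */
enum failback_sketch {
    FB_UNDEF = 0,
    FB_MANUAL,
    FB_IMMEDIATE,
    FB_FOLLOWOVER
};

/* `value` is the text after the parser has stripped any surrounding
 * quotes, so it is compared against the bare keywords. Comparing
 * against "\"manual\"" here would never match. */
enum failback_sketch parse_failback(const char *value)
{
    if (strcmp(value, "manual") == 0)
        return FB_MANUAL;
    if (strcmp(value, "immediate") == 0)
        return FB_IMMEDIATE;
    if (strcmp(value, "followover") == 0)
        return FB_FOLLOWOVER;
    return FB_UNDEF;
}
```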
Michael Witten [Tue, 23 Apr 2013 20:21:11 +0000 (20:21 +0000)]
Docs: multipath.8: A little quick copyediting
The following patch applies cleanly at least to the following commit:
de4e708f82c5f5f1575fafefbceb3624600c3dac
To apply this patch, save this email to:
/path/to/email
and then run the following from the repository:
$ git am --scissors /path/to/email
8<-----------8<-----------8<-----------8<-----------8<-----------8<-----------
Date: Mon, 22 Apr 2013 21:31:36 +0000
The grammar was arguably incorrect (or at least asymmetrical):
multipath is used to detect... and coalesces them
Also, I found the description for the `device' parameter
to be a little rough.
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Jerome Levy [Sat, 27 Apr 2013 07:54:55 +0000 (09:54 +0200)]
[libmultipath] Minimize noise with snapshots in emc_clariion checker
Patch to stop emc_clariion_checker from logging messages when probes
to snapshot LUNs occur. Notification is still available if logging is
turned up (see condlog()) but normal probing of snapshots will no
longer produce status messages. Path functionality on snapshot probes
is unchanged.
Signed-off-by: Jerry Levy <jerome.levy@emc.com>
Michael Witten [Thu, 25 Apr 2013 03:40:25 +0000 (03:40 +0000)]
Docs: multipath.conf.annotated: Document the `no_partitions' feature
The following commit added the `no_partitions' feature:
095942eb14d735af80aa7d1d9fd8d3d53dc0db70
Add 'no_partitions' feature
Date: Wed Jul 23 14:15:12 2008 +0200
Signed-off-by: Michael Witten <mfwitten@gmail.com>