multipath-tools/.git
2 months agolibmultipath: parse_vpd_pg80: fix overflow output
Martin Wilck [Mon, 24 Jun 2019 09:27:49 +0000 (11:27 +0200)]
libmultipath: parse_vpd_pg80: fix overflow output

"vpd pg80 overflow, 20/20 bytes required" looks weird. Make it clear
that actually 21 bytes are required in this case.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: fix parsing of non-space-terminated T10 ID
Martin Wilck [Mon, 24 Jun 2019 09:27:48 +0000 (11:27 +0200)]
libmultipath: fix parsing of non-space-terminated T10 ID

If the T10 vendor specific ID doesn't end with spaces, the last
part won't be parsed. Fix it.

Fixes: 18176202e75c "Read wwid from sysfs vpg_pg83 attribute"
Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: allow zero-padded SCSI names in parse_vpd_pg83()
Martin Wilck [Mon, 24 Jun 2019 09:27:47 +0000 (11:27 +0200)]
libmultipath: allow zero-padded SCSI names in parse_vpd_pg83()

The spec says that SCSI name strings designator length must be a multiple of
4, and that strings must be zero-terminated and zero-padded.
Fix the returned string length if the VPD input ends with multiple 0-bytes.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: parse_vpd_pg83: common code for SCSI string parsing
Martin Wilck [Mon, 24 Jun 2019 09:27:46 +0000 (11:27 +0200)]
libmultipath: parse_vpd_pg83: common code for SCSI string parsing

The three cases for eui, naa, and iqn parsing share almost all code.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: add consistent WWID overflow logging in parse_vpd_pg83
Martin Wilck [Mon, 24 Jun 2019 09:27:45 +0000 (11:27 +0200)]
libmultipath: add consistent WWID overflow logging in parse_vpd_pg83

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: fix parsing of SCSI name string, iqn format
Martin Wilck [Mon, 24 Jun 2019 09:27:44 +0000 (11:27 +0200)]
libmultipath: fix parsing of SCSI name string, iqn format

Do not overwrite the leading '8'.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: fix another WWID overflow in parse_vpd_pg83()
Martin Wilck [Mon, 24 Jun 2019 09:27:43 +0000 (11:27 +0200)]
libmultipath: fix another WWID overflow in parse_vpd_pg83()

This one is an obvious typo.

Fixes: 18176202e75c "Read wwid from sysfs vpg_pg83 attribute"
Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: fix possible WWID overflow in parse_vpd_pg83()
Martin Wilck [Mon, 24 Jun 2019 09:27:42 +0000 (11:27 +0200)]
libmultipath: fix possible WWID overflow in parse_vpd_pg83()

We have to check the remaining length before printing to the
output buffer, not afterwards.

Fixes: 18176202e75c "Read wwid from sysfs vpg_pg83 attribute"
Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: Fix buffer overflow in parse_vpd_pg80()
Martin Wilck [Mon, 24 Jun 2019 09:27:41 +0000 (11:27 +0200)]
libmultipath: Fix buffer overflow in parse_vpd_pg80()

We set out[len] = '\0' later, thus we should set len to no more then
out_len - 1.

Fixes: 756ef73b7197 "Separate out vpd parsing functions"
Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: fix parsing of VPD 83 type 1 (T10 vendor ID)
Martin Wilck [Mon, 24 Jun 2019 09:27:40 +0000 (11:27 +0200)]
libmultipath: fix parsing of VPD 83 type 1 (T10 vendor ID)

In the buffer overflow case, the code would set p_len = out_len - len - 2,
then len = len + plen = out_len - 2, and check if len >= out_len - 1,
which is never the case. Rather, set p_len = out_len - len -1, and
check the length again before appending the underscore.

Fixes: 18176202e75c "Read wwid from sysfs vpg_pg83 attribute"
Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agomultipath-tools tests: add test for VPD parsing
Martin Wilck [Mon, 24 Jun 2019 09:27:39 +0000 (11:27 +0200)]
multipath-tools tests: add test for VPD parsing

Add a test for parsing the WWID from VPDs, with focus on WWID
buffer overflow detection and handling.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agomultipath-tools tests: add strlcpy() tests
Martin Wilck [Mon, 24 Jun 2019 09:27:38 +0000 (11:27 +0200)]
multipath-tools tests: add strlcpy() tests

As we're using strlcpy quite a bit, make sure it works as designed.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agotests/hwtable: decrease log verbosity
Martin Wilck [Mon, 24 Jun 2019 09:27:37 +0000 (11:27 +0200)]
tests/hwtable: decrease log verbosity

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agomultipath-tools tests: omit timestamp in log messages
Martin Wilck [Mon, 24 Jun 2019 09:27:36 +0000 (11:27 +0200)]
multipath-tools tests: omit timestamp in log messages

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: add size argument to dm_get_uuid()
Martin Wilck [Mon, 24 Jun 2019 09:27:35 +0000 (11:27 +0200)]
libmultipath: add size argument to dm_get_uuid()

The length of the uuid field in libdm is DM_UUID_LEN, which happens
to be one byte more than our WWID_SIZE. Handle this cleanly.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: inline set_default()
Martin Wilck [Mon, 24 Jun 2019 09:27:34 +0000 (11:27 +0200)]
libmultipath: inline set_default()

This is nothing but a reimplementation of strdup(), and it causes gcc 9
warnings. Remove it.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmpathcmd: use target length for unix socket sun_path
Martin Wilck [Mon, 24 Jun 2019 09:27:32 +0000 (11:27 +0200)]
libmpathcmd: use target length for unix socket sun_path

Note that sun_path doesn't necessarily need to be 0-terminated for an abstract
socket name for ux_socket_listen(), this means we need to use memcpy to avoid
a spurious warning.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agomultipath-tools: Fix more strncpy(X, Y, size - 1) calls
Martin Wilck [Mon, 24 Jun 2019 09:27:31 +0000 (11:27 +0200)]
multipath-tools: Fix more strncpy(X, Y, size - 1) calls

The compiler emitted no warnings for these, but for consistency
it makes sense to fix them, too.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agomultipath-tools: fix more gcc 9 -Wstringop-truncation warnings
Martin Wilck [Mon, 24 Jun 2019 09:27:30 +0000 (11:27 +0200)]
multipath-tools: fix more gcc 9 -Wstringop-truncation warnings

More often than not, this means replacing strncpy() by strlcpy().

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: fix -Wstringop-overflow warning in merge_words()
Martin Wilck [Mon, 24 Jun 2019 09:27:29 +0000 (11:27 +0200)]
libmultipath: fix -Wstringop-overflow warning in merge_words()

Fixes the following warning from gcc 9:

In file included from /usr/include/string.h:494,
                 from dmparser.c:8:
In function ‘strncpy’,
    inlined from ‘merge_words’ at dmparser.c:41:2:
/usr/include/bits/string_fortified.h:106:10: warning: ‘__builtin_strncpy’
                 specified bound depends on the length of the source argument
                 [-Wstringop-overflow=]
  106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dmparser.c: In function ‘merge_words’:
dmparser.c:41:19: note: length computed here
   41 |  strncpy(p, word, strlen(word) + 1);
      |                   ^~~~~~~~~~~~

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: remove 'space' argument to merge_words()
Martin Wilck [Mon, 24 Jun 2019 09:27:28 +0000 (11:27 +0200)]
libmultipath: remove 'space' argument to merge_words()

merge_words() is always called with space = 1. Remove the argument.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmultipath: fix gcc -Wstringop-truncation warning in set_value()
Martin Wilck [Mon, 24 Jun 2019 09:27:27 +0000 (11:27 +0200)]
libmultipath: fix gcc -Wstringop-truncation warning in set_value()

Fixes the following warning:

In function ‘strncat’,
    inlined from ‘set_value’ at parser.c:382:3:
/usr/include/bits/string_fortified.h:136:10: warning: ‘__builtin_strncat’
    output truncated before terminating nul copying as many bytes from a
    string as its length [-Wstringop-truncation]
  136 |   return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
parser.c: In function ‘set_value’:
parser.c:382:3: note: length computed here
  382 |   strncat(alloc, str, strlen(str));
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

gcc's stringop checker expects that the size argument of strncat() is derived
from the destination, not source, side.
See https://developers.redhat.com/blog/2018/05/24/detecting-string-truncation-with-gcc-8/

Fix typo in error message along the way.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agokpartx: dasd: fix -Waddress-of-packed-member warning from gcc9
Martin Wilck [Mon, 24 Jun 2019 09:27:26 +0000 (11:27 +0200)]
kpartx: dasd: fix -Waddress-of-packed-member warning from gcc9

Fixes the following warning:

dasd.c: In function ‘read_dasd_pt’:
dasd.c:206:3: warning: converting a packed ‘volume_label_t’ {aka ‘struct
volume_label’} pointer (alignment 1) to a ‘label_ints_t’ {aka ‘unsigned int’}
pointer (alignment 4) may result in an unaligned pointer value
[-Waddress-of-packed-member]
  206 |   label_ints_t *label = (label_ints_t *) &vlabel;
      |   ^~~~~~~~~~~~

As volume_label_t is only used in read_dasd_pt(), and filled with memcpy(),
increasing its alignment should be safe.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agomultipath: call store_pathinfo with DI_BLACKLIST
Benjamin Marzinski [Mon, 3 Jun 2019 19:29:16 +0000 (14:29 -0500)]
multipath: call store_pathinfo with DI_BLACKLIST

Commit ca19f865f moved adding DI_BLACKLIST to the pathinfo flags out of
store_pathinfo(), but it didn't add it to all of the necessary callers.
Without this, store_pathinfo() callers can do unnecessary extra work,
including running the path checker on blacklisted devices. Also, running

multipathd add path <blacklisted_path_device>

will add a blacklisted path.

Fixes: ca19f865f "libmultipath: add 'cmd' as argument for get_refwwid()"
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2 months agompathpersist.8: add documentation for --batch-file (-f)
Martin Wilck [Mon, 27 May 2019 12:59:42 +0000 (14:59 +0200)]
mpathpersist.8: add documentation for --batch-file (-f)

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agompathpersist.8: fix examples in man page
Martin Wilck [Mon, 27 May 2019 12:59:41 +0000 (14:59 +0200)]
mpathpersist.8: fix examples in man page

--prout-type is ignored in the REGISTER service action. The RESERVE service
action takes a Reservation Key, not a Service action Reservation Key, as
argument. The text mentions "Service Action Reservation Key" in several places
where it means the Reservation Key. Add examples for unregistering the current
key, and the CLEAR service action. Fix formatting with longer text in the
description lines (the .TP formatting would produce bad results if the
first line needs to be broken).

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmpathpersist: don't bother with priorities
Martin Wilck [Mon, 27 May 2019 12:59:40 +0000 (14:59 +0200)]
libmpathpersist: don't bother with priorities

We send our PR commands to every active path, regardless of priority.
Thus we can save the effort to obtain priorities.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agompathpersist: initialize data structures only once
Martin Wilck [Mon, 27 May 2019 12:59:39 +0000 (14:59 +0200)]
mpathpersist: initialize data structures only once

We now have the possibility to run several PR commands in a single
mpathpersist invocation. Run initialization / discovery and teardown
only once at program invocation / exit.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmpathpersist: factor out initialization and teardown
Martin Wilck [Mon, 27 May 2019 12:59:38 +0000 (14:59 +0200)]
libmpathpersist: factor out initialization and teardown

mpath_presistent_reserve_{in,out} share a lot of common code
for initial data structure initialization (discovery) and teardown.
Factor this code out into mpath_persistent_reserve_init_vecs()
(global data structure initialization),
mpath_persistent_reserve_free_vecs (global teardown) and mpath_get_map()
(struct multipath setup for given map device).

Provide __mpath_presistent_reserve_{in,out}, which are the same
as their counterparts without leading underscores, but do not
call the global setup and teardown routines. This allows running
several PR commands in a row without having to re-initialize in
between. Because libmpathpersist is a public API, the previously
known functions don't change behavior.

Don't call path_discovery() any more during global initialization.
We rather do this lazily in the get_mpvec() call chain. dm_get_maps(),
OTOH, is part of the global initialization procedure. In get_mpvec(),
we don't delete non-matching maps any more, because we way want to
act on them later on.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agolibmpathpersist: updatepaths: deal with missing pp->udev
Martin Wilck [Mon, 27 May 2019 12:59:37 +0000 (14:59 +0200)]
libmpathpersist: updatepaths: deal with missing pp->udev

We will change the data structure initialization to a lazy
approach, where pp->udev isn't necessarily initialized
when get_mpvec() is called. Deal with it.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agompathpersist: no need to treat error close() as fatal
Martin Wilck [Mon, 27 May 2019 12:59:36 +0000 (14:59 +0200)]
mpathpersist: no need to treat error close() as fatal

Simplify code a bit.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agompathpersist: add option --batch-file (-f)
Martin Wilck [Mon, 27 May 2019 12:59:35 +0000 (14:59 +0200)]
mpathpersist: add option --batch-file (-f)

Add the option --batch-file (-f) to mpathpersist. The option argument
is a text file that is parsed line-by-line. Every line of the file is
interpreted as an additional input line for mpathpersist. The initial
"mpathpersist" on each line is optional. The '#' character denotes
a comment. '#' is only recognized after whitespace. Empty lines,
or comment lines, are allowed.

If -f is given, other command line options are parsed as usual and
commands (if any) are run before running the batch file. Inside the
batch file, the option -f is forbidden, and -v is ignored. If a command
fails, the batch processing is not aborted. The return status of
mpathpersist is 0 if all commands succeeded, and the status of the
first failed command otherwise.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agompathpersist: call usage() just once on return
Martin Wilck [Mon, 27 May 2019 12:59:34 +0000 (14:59 +0200)]
mpathpersist: call usage() just once on return

This simplifies further changes.

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agomultipathd: handle NULL return from genhelp_handler
Benjamin Marzinski [Fri, 17 May 2019 16:14:10 +0000 (11:14 -0500)]
multipathd: handle NULL return from genhelp_handler

parse_cmd() wasn't checking if genhelp_handler() returned NULL. It was simply
assuming that it got a string. On NULL, it now returns an error. Found by
Coverity.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2 months agomultipathd: fix REALLOC_REPLY with max length reply
Benjamin Marzinski [Fri, 17 May 2019 16:14:09 +0000 (11:14 -0500)]
multipathd: fix REALLOC_REPLY with max length reply

Commit cd5a9797e added code to REALLOC_REPLY() that intended to stop
growing the reply buffer after it reached a maximum size. However this
code didn't stop the realloc() from happening. Worse, if the realloc()
failed, multipathd would double free the reply buffer. Found by
Coverity.

Fixes: cd5a9797e "libmpathcmd(coverity): limit reply length"
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2 months agokpartx: fail if dup() of dasd file descriptor fails
Benjamin Marzinski [Fri, 17 May 2019 16:14:08 +0000 (11:14 -0500)]
kpartx: fail if dup() of dasd file descriptor fails

If kpartx fails to create a copy of the dasd file descriptor, it should
fail, instead of treating the error value as a valid fd. Found by
coverity.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
2 months agolibmultipath: get_prio(): really don't reset prio for inaccessible paths
Martin Wilck [Thu, 16 May 2019 07:10:24 +0000 (09:10 +0200)]
libmultipath: get_prio(): really don't reset prio for inaccessible paths

As pointed out by Ben Marzinski, my previous patch ebbb56f2 doesn't do what
it pretends to do.

Fixes: ebbb56f2 "libmultipath: get_prio(): don't reset prio for inaccessible
paths"

2 months agomultipath-tools: format correctly maintainer info in hwtable
Xose Vazquez Perez [Tue, 7 May 2019 18:53:09 +0000 (20:53 +0200)]
multipath-tools: format correctly maintainer info in hwtable

Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
2 months agomultipath-tools: document missing mpathpersist flags in help output
Xose Vazquez Perez [Thu, 2 May 2019 21:49:01 +0000 (23:49 +0200)]
multipath-tools: document missing mpathpersist flags in help output

Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
2 months agomultipath-tools: document missing multipathd option at man page
Xose Vazquez Perez [Thu, 2 May 2019 19:26:19 +0000 (21:26 +0200)]
multipath-tools: document missing multipathd option at man page

 -w was added in 2858c60a34527401381ea6b13cf316c3e5064383

Cc: Benjamin Marzinski <bmarzins@redhat.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
2 months agomultipath-tools: document missing multipath flags in help output
Xose Vazquez Perez [Thu, 2 May 2019 19:24:08 +0000 (21:24 +0200)]
multipath-tools: document missing multipath flags in help output

Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
2 months agomultipathd: fix client response for socket activation
Martin Wilck [Tue, 30 Apr 2019 22:41:38 +0000 (00:41 +0200)]
multipathd: fix client response for socket activation

When a client wakes up multipathd through the socket, it is likely that the
ux listener responds to client requests before multipathd startup has
completed. This means that client commands such as "show paths" or "show
topology" return success with an empty result, which is quite confusing.

Therefore, in the ux listener, don't start answering client requests while
the daemon is configuring. Rather, wait for some other daemon state. We
can't wait hard, because the ux listener must also handle signals. So just
wait for some short time, and poll again.

This has the side effect that command responses for commands that don't
require full initialization, such as "show wildcards", "show config" or
"shutdown", are also delayed until the configuration stage has completed.
But that seems to be a relatively cheap price to pay for getting the
expected response for other commands. To avoid this side effect, the client
handling would need to be rewritten such that the uxlsnr thread would have
a means to "know" which commands require the configuration stage to
complete and which do not.

v2: Removed an unrelated, unnecessary hunk in child().

Signed-off-by: Martin Wilck <mwilck@suse.com>
2 months agomultipath-tools: document missing kpartx options in man and output
Xose Vazquez Perez [Sat, 27 Apr 2019 01:30:45 +0000 (03:30 +0200)]
multipath-tools: document missing kpartx options in man and output

Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
2 months agomultipath -u: test socket connection in non-blocking mode
Martin Wilck [Wed, 24 Apr 2019 09:07:59 +0000 (11:07 +0200)]
multipath -u: test socket connection in non-blocking mode

Since commit d7188fcd "multipathd: start daemon after udev trigger",
multipathd startup is delayed during boot until after "udev settle"
terminates. But "multipath -u" is run by udev workers for storage devices,
and attempts to connect to the multipathd socket. This causes a start job
for multipathd to be scheduled by systemd, but that job won't be started
until "udev settle" finishes. This is not a problem on systems with 129 or
less storage units, because the connect() call of "multipath -u" will
succeed anyway. But on larger systems, the listen backlog of the systemd
socket can be exceeded, which causes connect() calls for the socket to
block until multipathd starts up and begins calling accept(). This creates
a deadlock situation, because "multipath -u" (called by udev workers)
blocks, and thus "udev settle" doesn't finish, delaying multipathd
startup. This situation then persists until either the workers or "udev
settle" time out. In the former case, path devices might be misclassified
as non-multipath devices by "multipath -u".

Fix this by using a non-blocking socket fd for connect() and interpret the
errno appropriately.

This patch reverts most of the changes from commit 8cdf6661 "multipath:
check on multipathd without starting it". Instead, "multipath -u" does
access the socket and start multipath again (which is what we want IMO),
but it is now able to detect and handle the "full backlog" situation.

Signed-off-by: Martin Wilck <mwilck@suse.com>
V2:

Use same error reporting convention in __mpath_connect() as in
mpath_connect() (Hannes Reinecke). We can't easily change the latter,
because it's part of the "public" libmpathcmd API.

4 months agolibmultipath: get_prio(): don't reset prio for inaccessible paths
Martin Wilck [Thu, 11 Apr 2019 10:49:23 +0000 (12:49 +0200)]
libmultipath: get_prio(): don't reset prio for inaccessible paths

pathinfo() doesn't call get_prio() if a path is down, now keeping the old
priority value. But if a path becomes inaccessible between the state check
and the get_prio() call, retrieving the priority will likely fail, and the
path priority will be reset to PRIO_UNDEF. This makes it timing-dependent
how the priority of a failed path is treated. Fix that by checking the path
state in get_prio() if an error occurs, and not touching pp->priority if
the path is in inaccessible state. A checker_check() call would be too
much here, but a quick path_offline() check seems appropriate.

Keep the previous behavior for the case where getting the priority fails
although the path is apparently healthy. This is presumably a very rare
condition, in which it seems actually wrong to preserve the old prio value.

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: ana prioritizer: decrease log level
Martin Wilck [Thu, 11 Apr 2019 10:49:22 +0000 (12:49 +0200)]
libmultipath: ana prioritizer: decrease log level

The "ana state = ..." printed in the get_prio() call path
is irritating, because by it's similarity to the checker
messages it suggests to the reader that an "ana checker",
which doesn't exist, was in place. Moreover, the "ana prio ="
message is also only printed at level 4.

Decrease the verbosity of this message.

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agoRevert "Set priority to '0' for PATH_BLOCKED or PATH_DOWN"
Martin Wilck [Thu, 11 Apr 2019 10:49:21 +0000 (12:49 +0200)]
Revert "Set priority to '0' for PATH_BLOCKED or PATH_DOWN"

This reverts commit ce8d707c4235860373238dea6491a77a931d4c9f.

In check_path(), we don't touch path priority if a path is down.
But when pathinfo(DI_CHECKER) is called in down state, we reset
the priority to 0. This is inconsistent.

Commit ce8d707 was about maps being rejected during multipath startup
because of undefined priorities. Since commit 94036e3 "libmultipath:
don't reject maps with undefined prio", such maps aren't rejected
any more, thus we can skip resetting the priority to 0.

Note that when we calculate path group priorities, the prio of
paths which are not UP or GHOST are ignored anyway, so this
change will not cause changes wrt PG priorities or PG ordering.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Cc: Hannes Reinecke <hare@suse.de>
4 months agolibmultipath: group_by_prio: fix signedness bug
Martin Wilck [Thu, 11 Apr 2019 10:49:20 +0000 (12:49 +0200)]
libmultipath: group_by_prio: fix signedness bug

pp->priority can be negative, so we we shouldn't compare
it with an unsigned int.

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agomultipath: check on multipathd without starting it
Benjamin Marzinski [Thu, 18 Apr 2019 17:49:46 +0000 (12:49 -0500)]
multipath: check on multipathd without starting it

When "multipath -u" is run, it checks if multipathd is running.
Currently it does this by trying to connect to the mutipathd socket.
This can cause problems during boot.  The multipathd.socket systemd unit
file will cause "multipath -u" to wait until multipathd has been started
before continuing.  If there is a lot of activity on the system,
multipathd may not start up immediately, causing block device
initialization to be delayed, potentially until after systemd times
waiting for the device.  To avoid this, multipath now checks if
multipathd is running by reading /run/multipathd.pid and checking the
/proc/<pid>/comm to verify that multipathd is really running with this
pid. This avoids forcing "multipath -u" to wait on multipathd starting
up.

As an alternative to this patch, multipath could simply switch the order
of the calls to systemd_service_enabled() and mpath_connect(). This would
make multipath only try to connect with multipathd if it wasn't enabled in
systemd, so that it wouldn't autostart.

Another alternative is to do away with multipathd.socket. Since multipathd
needs to always be running in order to get uevents, there isn't much value
in having it autoactivate when it gets an interactive command.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agoBump version to 0.8.1 0.8.1
Christophe Varoqui [Thu, 18 Apr 2019 11:13:10 +0000 (13:13 +0200)]
Bump version to 0.8.1

4 months agomultipath-tools: reorder vendors in hwtable
Xose Vazquez Perez [Sat, 16 Mar 2019 23:05:06 +0000 (00:05 +0100)]
multipath-tools: reorder vendors in hwtable

Xio was acquired by Violin, and add FlashSystem 9100 to Storwize in comments.

Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
4 months agomultipath-tools: maintain the uniformity in the multipath.conf.5 page
Xose Vazquez Perez [Sat, 16 Mar 2019 22:25:56 +0000 (23:25 +0100)]
multipath-tools: maintain the uniformity in the multipath.conf.5 page

Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
4 months agolibmultipath: hwtable: add Lenovo DE series
Martin Wilck [Mon, 18 Mar 2019 11:24:45 +0000 (12:24 +0100)]
libmultipath: hwtable: add Lenovo DE series

I got this from Steven.

Cc: Steve.Schremmer@netapp.com
Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
Cc: xose.vazquez@gmail.com
Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: check_rdac(): pre-check in hwtable
Martin Wilck [Mon, 18 Mar 2019 11:24:44 +0000 (12:24 +0100)]
libmultipath: check_rdac(): pre-check in hwtable

Currently check_rdac() always runs an SG_IO for VPD 0xc9 to check
if the storage supports RDAC. This is an extra IO, and may cause
annoying error messages on the storage side for non-RDAC arrays.
Do not use the RDAC override for arrays that have legacy configuration
to use a checker other than "rdac". For "unkown" devices with no checker
configured in either the hwtable or multipath.conf, the VPD call will
be tried.

Cc: Steve.Schremmer@netapp.com
Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: tidy up do_set_from_hwe() with statement expression
Martin Wilck [Mon, 18 Mar 2019 11:24:43 +0000 (12:24 +0100)]
libmultipath: tidy up do_set_from_hwe() with statement expression

propsel.c has a lot of "funky" macros making assumptions about
variable and label names in callers. This one is particularly
ugly. As a first tidy-up step, split out a statement expression
which can be called cleanly from code that doesn't have said
variables and labels.

Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: don't link alua_rtpg.o twice
Martin Wilck [Mon, 18 Mar 2019 11:24:42 +0000 (12:24 +0100)]
libmultipath: don't link alua_rtpg.o twice

We link this already to libmultipath.so. Therefore, no need
to link ti to libprioalua.so, too.

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: lazy tpgs probing
Martin Wilck [Mon, 18 Mar 2019 11:24:41 +0000 (12:24 +0100)]
libmultipath: lazy tpgs probing

Provide a "getter" function that can be used to probe tpgs lazily.
This way we don't need to send an RTPG in the pathinfo() call
chain (e.g. in "multipath -u"). With this in place, no "user"
code should access pp->tpgs directly any more.

Moreover, in select_prio(), in the case where the alua checker
was statically configured, rather then calling into the alua
code directly, use get_tpgs(), which does all the proper error
checking, and fall back to const prio if it fails.

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: detect_alua(): use system timeout
Martin Wilck [Mon, 18 Mar 2019 11:24:40 +0000 (12:24 +0100)]
libmultipath: detect_alua(): use system timeout

This is not the path checker - we don't need to use the
configured checker timeout here. This makes it possible to
call this function without a current (struct config *).

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: constify sysfs_get_timeout()
Martin Wilck [Mon, 18 Mar 2019 11:24:39 +0000 (12:24 +0100)]
libmultipath: constify sysfs_get_timeout()

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: alua: try to retrieve inquiry data from sysfs
Martin Wilck [Mon, 18 Mar 2019 11:24:38 +0000 (12:24 +0100)]
libmultipath: alua: try to retrieve inquiry data from sysfs

This can avoid IO while configuring the path prioritizer.
The alua prioritizer avoids reading from sysfs for a reason
(see commit 7e2f46d3), but this applies only to RTPG / STPG,
not to INQUIRY calls.

Cc: Steve.Schremmer@netapp.com
Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: alua: make API more consistent
Martin Wilck [Mon, 18 Mar 2019 11:24:37 +0000 (12:24 +0100)]
libmultipath: alua: make API more consistent

Let all alua functions take "const struct path *" as first
argument.

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: add sysfs_get_inquiry()
Martin Wilck [Mon, 18 Mar 2019 11:24:36 +0000 (12:24 +0100)]
libmultipath: add sysfs_get_inquiry()

Provide a utility function to retrieve inquiry data from
sysfs, like we do for VPDs.

Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agomultipathd: protect all access to running_state
Martin Wilck [Thu, 11 Apr 2019 10:27:14 +0000 (12:27 +0200)]
multipathd: protect all access to running_state

Chonyun Wu's latest patch has shown that the handling of the daemon
state variable running_state is racy and difficult to get right. It's
not a good candidate for a "benign race" annotation. So, as a first
step to sanitizing it, make sure all accesses to the state variable
are protected by config_lock.

The patch also replaces "if" with "while" in several places where the
code was supposed to wait until a certain state was reached. It's
important that DAEMON_SHUTDOWN terminates all loops of this kind.

Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agomultipathd: fix daemon not really shutdown
Chongyun Wu [Thu, 11 Apr 2019 10:27:13 +0000 (12:27 +0200)]
multipathd: fix daemon not really shutdown

Test environment: 25 hosts, each host have more than 100 luns,
each lun have two paths.
Some times when we try to ceate new multipath will encounter "could
not create uxsock:98" but the multipathd still running not shutdown
and can't response any multipathd commands also.

After reproduce this issue and debug, found below fixes might work:
(1) set_config_state() after pthread_cond_timedwait() other threads
might changed the running_state from DAEMON_SHUTDOWN to other status
like DAEMON_IDLE, which will make the shutdown process stopped.
I found logs to prove this really happened, so we need add judgement
here too.

(2) [this part removed by mwilck]

Signed-off-by: Chongyun Wu <wu.chongyun@h3c.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipath-tools: Build: properly parse systemd's version
Dominique Leuenberger [Thu, 11 Apr 2019 10:27:12 +0000 (12:27 +0200)]
multipath-tools: Build: properly parse systemd's version

Since systemd 241, systemctl --version no longer 'just' prints out the
version, but gives more information like git commit ref and whatnot. In
it's shortest form, it gives something like "systemd 241 (241)", which when
passed as parameter "-DUSE_SYSTEMD=241 (241)" results in shell errors.

Try to retrieve the version from pkg-config instead, and if that fails,
discard anything after the first number in "systemctl --version" output.

Acked-by: Benjamin Marzinski <bmarzins@redhat.com>
Acked-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
4 months agolibmultipath: silence dm_is_mpath error messages
Benjamin Marzinski [Sat, 30 Mar 2019 06:06:06 +0000 (01:06 -0500)]
libmultipath: silence dm_is_mpath error messages

When "multipath -F" is run, dm_is_mpath was printing error messages
about partition devices, because they had already been removed, when
it checked.  Lower the error logging level so this doesn't happen on
the default verbosity.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipathd: Don't use fallback code after getting wwid
Benjamin Marzinski [Sat, 30 Mar 2019 06:06:05 +0000 (01:06 -0500)]
multipathd: Don't use fallback code after getting wwid

The fallback code is necessary to set up mutipath devices, if multipath
temporarily can't get the information from udev.  However, once the
devices are set up, udev is the definitive source of this information.

The wwid gotten from the fallback code and the udev code should always
be the same, in which case it doesn't matter where we get the wwid
from. But if they are different, it's important to try to do the
right thing.

Working under the assumption that udev will either never give us this
information, or that it usually will. multipath should assume that if
there are multiple paths to a device, either they will all never get
a wwid from udev, or some of them will likely already have gotten the
correct wwid from udev.  In this case, we should fix this as soon as
possible.

This does mean that devices where udev will never give out the uuid
will not notice if the wwid changes, but that's a small price to pay
for doing the right thing most of the time.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipathd: ignore "disable_changed_wwids"
Martin Wilck [Sat, 30 Mar 2019 06:06:04 +0000 (01:06 -0500)]
multipathd: ignore "disable_changed_wwids"

This option has no effect any more.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipathd: remove "wwid_changed" path attribute
Martin Wilck [Sat, 30 Mar 2019 06:06:03 +0000 (01:06 -0500)]
multipathd: remove "wwid_changed" path attribute

This is now not needed any more.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipathd: handle changed wwids by removal and addition
Benjamin Marzinski [Sat, 30 Mar 2019 06:06:02 +0000 (01:06 -0500)]
multipathd: handle changed wwids by removal and addition

If a path's WWID changes, it's not necessarily failed. But it certainly
has to be removed from an existing map, otherwise data corruption is
imminent. Instead of keeping the path in the map, failing it, and
remembering the "changed WWID" state, this patch simply removes and
re-adds the path.

This is patch is heavily based on the previous patch of the same name
by Martin Wilck.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agolibmulitpath: cleanup uid_fallback code
Benjamin Marzinski [Sat, 30 Mar 2019 06:06:01 +0000 (01:06 -0500)]
libmulitpath: cleanup uid_fallback code

Instead of always calling uid_fallback() if the configured method to get
the uid failed, get_uid now checks if the path supports fallbacks and if
all the retriggers have occurred. If so, it calls uid_fallback(), which
just attempts to get the uid using the appropriate fallback method. None
of these changes should make the code function any differently.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agolibmultipath: add get_uid fallback code for NVMe devices
Benjamin Marzinski [Sat, 30 Mar 2019 06:06:00 +0000 (01:06 -0500)]
libmultipath: add get_uid fallback code for NVMe devices

If multipath can't get the uid for NVMe devices from udev, it can get it
directly from sysfs.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipath.conf: add missing options to man page
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:59 +0000 (01:05 -0500)]
multipath.conf: add missing options to man page

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipathd: use update_path_groups instead of reload_map
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:58 +0000 (01:05 -0500)]
multipathd: use update_path_groups instead of reload_map

reload_map() doesn't do the work to sync the state after reloading the
map.  Instead of calling it directly, cli_reload() and uev_update_path()
should call update_path_groups(), which calls reload_map() with all the
necessary syncing.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agolibmutipath: continue to use old state on PATH_PENDING
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:57 +0000 (01:05 -0500)]
libmutipath: continue to use old state on PATH_PENDING

When pathinfo() sets pp->state to PATH_PENDING, it can cause problems
with path checking.  It should act more like check_path(). When
check_path() sees a new state of PATH_PENDING, it doesn't update the
path state at all, so a path's old state is normally never PATH_PENDING.

As and example of the problems of setting a path to PATH_PENDING, If
check_path() sets a path's state to PATH_UP, then a call to pathinfo()
sets the state to PATH_PENDING, and then another call the check_path()
sets the state to PATH_DOWN, multipathd won't fail the path in the
kernel. Also, if a path's state is PATH_PENDING, and nr_active is
recalculated, that path will count as down, even if the state was
previously PATH_UP. If a path already has a state of PATH_WILD or
PATH_UNCHECKED, changing it to PATH_PENDING won't hurt anything, and it
will help anyone who sees it know what's actually happening. But
otherwise, pathinfo() should leave the previous state alone.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipathd: ignore failed wwid recheck
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:56 +0000 (01:05 -0500)]
multipathd: ignore failed wwid recheck

If disable_changed_wwids is set, when multipathd gets a change event on
a path, it verifies that the wwid hasn't changed in uev_update_path().
If get_uid() failed, uev_update_path treated this as a wwid change to 0.
This could cause paths to suddenly be dropped due to an issue with
getting the wwid.  Even if get_uid() failed because the path was down,
it no change uevent happend when it later became active, multipathd
would continue to ignore the path. Also, scsi_uid_fallback() clears the
failure return if it doesn't attempt to fallback, causing get_uid()
to return success, when it actually failed.

Multipathd should neither set nor clear wwid_changed if get_uid()
returned failure. Also, scsi_uid_fallback() should retain the old return
value if it doesn't attempt to fallback.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipathd: Fix miscounting active paths
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:55 +0000 (01:05 -0500)]
multipathd: Fix miscounting active paths

When multipathd gets a change uevent, it calls pathinfo with DI_NOIO.
This sets the path state to the return value of path_offline(). If a
path is in the PATH_DOWN state but path_offline() returns PATH_UP, when
that path gets a change event, its state will get moved to PATH_UP
without either reinstating the path, or reloading the map.  The next
call to check_path() will move the path back to PATH_DOWN. Since
check_path() simply increments and decrements nr_active instead of
calculating it based on the actual number of active paths, nr_active
will get decremented a second time for this failed path, potentially
putting the multipath device into recovery mode.

This commit does two things to avoid this situation. It makes the
DI_NOIO flag only set pp->state in pathinfo() if DI_CHECKER is also set.
This isn't set in uev_update_path() to avoid changing the path state in
this case.  Also, to guard against pp->state getting changed in some
other code path without properly updating the map state, check_path()
now calls set_no_path_retry, which recalculates nr_active based on the
actual number of active paths, and makes sure that the queue_if_no_path
value in the features line is correct.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agolibmultipath: fix marginal_paths nr_active check
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:54 +0000 (01:05 -0500)]
libmultipath: fix marginal_paths nr_active check

Marginal paths are SHAKY, so they don't count towards the number of
active paths. poll_io_err_stat() shouldn't automatically reinstate a
marginal path if there already is an active path.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agolibmultipath: fix marginal paths queueing errors
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:53 +0000 (01:05 -0500)]
libmultipath: fix marginal paths queueing errors

The current marginal paths code tries to enqueue paths for io error
checking when multipathd receives a uevent on path failure. This can run
into a couple of problems. First, this uevent could happen before or
after multipathd actually fails the path, so simply checking nr_active
doesn't tell us if this is the last active path. Also, The code to fail
the path in enqueue_io_err_stat_by_path() doesn't ever update the path
state. This can cause the path to get failed twice, temporarily leading
to incorrect nr_active counts. Further, The point when multipathd should
care if this is the last active path is when the path has come up again,
not when it goes down. Lastly, if the path is down, it is often
impossible to open the path device, causing setup_directio_ctx() to
fail, which causes multipathd to skip io error checking and mark the
path as not marginal.

Instead, multipathd should just make sure that if the path is marginal,
it gets failed in the uevent, so as not to race with the checkerloop
thread. Then, when the path comes back up, check_path() can enqueue it,
just like it does for paths that need to get rechecked. To make it
obvious that the state PATH_IO_ERR_IN_POLLING_RECHECK and the function
hit_io_err_recheck_time() now apply to paths waiting to be enqueued for
the first time as well, I've also changed their names to
PATH_IO_ERR_WAITING_TO_CHECK and need_io_err_check(), respectively.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agomultipathd: cleanup marginal paths checking timers
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:52 +0000 (01:05 -0500)]
multipathd: cleanup marginal paths checking timers

When a path gets recovered in hit_io_err_recheck_time(), it will
continue running in check_path(), so there is no reason to schedule
another path check as soon as possible (since one is currently
happening).

Also, there isn't much point in restarting the io err stat checking when
the path is down, so hit_io_err_recheck_time() should only be run when
the path is up. Downed marginal paths can be treated just like any other
downed path.

Finally, there is no reason to set reset pp->io_err_dis_reinstate_time
when we decide to enqueue a path. Either th enqueue will fail and the
path will get recovered, or it will succeed, and we won't check the
reinstate time again until poll_io_err_stat() marks the path as needing
a requeue.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agolibmultipath: handle existing paths in marginal_path enqueue
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:51 +0000 (01:05 -0500)]
libmultipath: handle existing paths in marginal_path enqueue

If the path that enqueue_io_err_stat_by_path() is trying to add
is already on the list, just return success. There's no reason
to fail in this case.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
4 months agoBZ 1668693: disable user_friendly_names for NetApp
Benjamin Marzinski [Sat, 30 Mar 2019 06:05:50 +0000 (01:05 -0500)]
BZ 1668693: disable user_friendly_names for NetApp

NetApp has tools that rely on devices using WWID names. To avoid
breaking these, NetApp devices should continue to use WWID names, even
if the default config is set to enable user_friendly_names. If users
want to use user_friendly_names on NetApp devices, the must specifically
override the device config.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
6 months agoBump version to 0.8.0 0.8.0
Christophe Varoqui [Thu, 14 Feb 2019 17:55:29 +0000 (18:55 +0100)]
Bump version to 0.8.0

6 months agomultipathd: don't resend change events for unknown devices
Benjamin Marzinski [Wed, 13 Feb 2019 22:55:08 +0000 (16:55 -0600)]
multipathd: don't resend change events for unknown devices

If multipath fails to get the wwid for a device, and the device is
of an unknown type (pp->bus == SYSFS_BUS_UNDEF), don't send change
events. Instead, assume that the device was not meant to be used
and skip it.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
6 months agomultipathd: fix pp->initialized state ping-ponging
Benjamin Marzinski [Wed, 13 Feb 2019 22:55:07 +0000 (16:55 -0600)]
multipathd: fix pp->initialized state ping-ponging

When a multipath device fails to get a wwid in pathinfo, it moves to the
INIT_MISSING_UDEV state. After a device in this state sends
retrigger_tries change uevents in check_path(), it moves to the
INIT_FAILED state.  However, when check_path() is run on a device in
INIT_FAILED, it can call pathinfo, which will set the path back
into INIT_MISSING_UDEV if it cannot get a wwid.  The next call to
check_path() will put the path back into INIT_FAILED.  The device will
continue to ping-pong between these states.

To solve this a new pp->initialized state has been added INIT_NEW.  New
path devices start in this state, instead of INIT_FAILED. INIT_NEW and
INIT_FAILED are treated exactly the same, with one exception. A device
in INIT_FAILED cannot transition back to INIT_MISSING_UDEV.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
6 months agomultipath: blacklist zram devices
Benjamin Marzinski [Wed, 13 Feb 2019 22:55:06 +0000 (16:55 -0600)]
multipath: blacklist zram devices

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
6 months agomultipathd: avoid null pointer dereference in LOG_MSG
Martin Wilck [Wed, 13 Feb 2019 22:55:05 +0000 (16:55 -0600)]
multipathd: avoid null pointer dereference in LOG_MSG

LOG_MSG() will dereference pp->mpp. Commit cb5ec664 added a call to
LOG_MSG() before the check for (!pp->mpp) in check_path.  This can cause
multipathd to crash.  LOG_MSG() should check the fields before dereferencing
them. Make checker_selected() an inline function to allow the compiler
to optimize away the usually redundant test "if (&checker->pp != NULL)".

Also, checker_message() should fail to a generic message if c->cls isn't
set (which means that a checker hasn't been selected).

Fixes: cb5ec664 (multipathd: check_path: improve logging for "unusable
                 path" case)
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agomultipath-tools: git should ignore auto-generated files
Xose Vazquez Perez [Mon, 14 Jan 2019 17:12:19 +0000 (18:12 +0100)]
multipath-tools: git should ignore auto-generated files

blacklist libmultipath/nvme-ioctl.[ch]

Cc: Martin Wilck <mwilck@suse.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
7 months agolibmultipath(coverity): fix possible NULL dereference
Martin Wilck [Tue, 8 Jan 2019 22:54:09 +0000 (23:54 +0100)]
libmultipath(coverity): fix possible NULL dereference

coverity warns that recv_packet may set reply to NULL.

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agolibmultipath/foreign(coverity): retval check in snprint_nvme_path
Martin Wilck [Tue, 8 Jan 2019 22:54:08 +0000 (23:54 +0100)]
libmultipath/foreign(coverity): retval check in snprint_nvme_path

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agolibmpathpersist(coverity): range checking for PRIN length
Martin Wilck [Tue, 8 Jan 2019 22:54:07 +0000 (23:54 +0100)]
libmpathpersist(coverity): range checking for PRIN length

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agomultipathd(coverity): check retval clock_gettime()
Martin Wilck [Tue, 8 Jan 2019 22:54:06 +0000 (23:54 +0100)]
multipathd(coverity): check retval clock_gettime()

Checking this is pointless, as we'd bail out early in
pthread_cond_init_mono if CLOCK_MONOTONIC was unsupported, and this
is the only error condition of clock_gettime worth checking.
Do it anyway to make coverity feel better.

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agolibmultipath(coverity): fix "enum misuse" for find_multipaths
Martin Wilck [Tue, 8 Jan 2019 22:54:05 +0000 (23:54 +0100)]
libmultipath(coverity): fix "enum misuse" for find_multipaths

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agolibmultipath(coverity): fix int overflow in sysfs_set_scsi_tmo
Martin Wilck [Tue, 8 Jan 2019 22:54:04 +0000 (23:54 +0100)]
libmultipath(coverity): fix int overflow in sysfs_set_scsi_tmo

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agolibmpathcmd(coverity): limit reply length
Martin Wilck [Tue, 8 Jan 2019 22:54:03 +0000 (23:54 +0100)]
libmpathcmd(coverity): limit reply length

coverity warned about tainted input data.

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agolibmultipath(coverity): fix apparent overflow
Martin Wilck [Tue, 8 Jan 2019 22:54:02 +0000 (23:54 +0100)]
libmultipath(coverity): fix apparent overflow

"preferred_path" contains always "0" or "1".

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agolibmultipath(coverity): make sure readlink result is 0-terminated
Martin Wilck [Tue, 8 Jan 2019 22:54:01 +0000 (23:54 +0100)]
libmultipath(coverity): make sure readlink result is 0-terminated

Coverity warned that readlink() results aren't necessarily 0-terminated.

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agokpartx(coverity): fix apparent out-of-bounds access
Martin Wilck [Tue, 8 Jan 2019 22:54:00 +0000 (23:54 +0100)]
kpartx(coverity): fix apparent out-of-bounds access

This was a false positive.

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agolibmultipath(coverity): cleanup dup usage in execute_program()
Martin Wilck [Tue, 8 Jan 2019 22:53:59 +0000 (23:53 +0100)]
libmultipath(coverity): cleanup dup usage in execute_program()

coverity complained about resource leakage here.

Signed-off-by: Martin Wilck <mwilck@suse.com>
7 months agokpartx(coverity): fix resource leak warning
Martin Wilck [Tue, 8 Jan 2019 22:53:58 +0000 (23:53 +0100)]
kpartx(coverity): fix resource leak warning

This was an easy-to-fix false positive.

Signed-off-by: Martin Wilck <mwilck@suse.com>