multipath-tools/.git
Hannes Reinecke [Tue, 8 Jan 2013 13:54:19 +0000 (14:54 +0100)]
multipathd: lock vectors during initial configuration

During initial configuration the CLI thread is already running,
so we need to lock the vectors here to not race with the
'reconfigure' CLI command.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:18 +0000 (14:54 +0100)]
multipathd: crash in reconfigure CLI command

The 'reconfigure' CLI command doesn't take the vector lock,
so if multipathd is processing a table / udev event at the
same time it'll crash.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:17 +0000 (14:54 +0100)]
multipathd: sighandlers might use uninitialized gvecs

gvecs are initialized after signal handlers, which in turn
might access the vectors.
So the signal handlers might access uninitialized variables.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:16 +0000 (14:54 +0100)]
multipathd deadlocks during restart

During restart multipathd might deadlock as the uevent handler
is missing a cleanup handler. Thus the thread might be terminated
while it still holds the vector lock.
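
A minimal sketch of the pattern this fix relies on, assuming POSIX threads. The names vecs_lock, unlock_vecs, uevent_worker and cancel_and_check are illustrative and not taken from multipathd. A cleanup handler registered with pthread_cleanup_push() releases the lock if the thread is cancelled while holding it:

```c
#include <pthread.h>

static pthread_mutex_t vecs_lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: runs if the thread is cancelled inside the push/pop pair. */
static void unlock_vecs(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Stand-in for an event handler that works while holding the lock. */
static void *uevent_worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&vecs_lock);
    pthread_cleanup_push(unlock_vecs, &vecs_lock);
    for (;;)
        pthread_testcancel();   /* explicit cancellation point */
    pthread_cleanup_pop(1);     /* not reached here; pairs with the push */
    return NULL;
}

/* Cancel the worker, then report whether the lock is free again (0 = free). */
int cancel_and_check(void)
{
    pthread_t tid;
    int rc;

    if (pthread_create(&tid, NULL, uevent_worker, NULL) != 0)
        return -1;
    pthread_cancel(tid);
    pthread_join(tid, NULL);
    rc = pthread_mutex_trylock(&vecs_lock);
    if (rc == 0)
        pthread_mutex_unlock(&vecs_lock);
    return rc;
}
```

Without the push/pop pair, the trylock in cancel_and_check() would fail with EBUSY, which is the deadlock described above.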

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:15 +0000 (14:54 +0100)]
multipathd: Ignore errors when creating pidfile

We can use CLI commands to communicate with the daemon,
so we don't need the pidfile for correct operation.
Hence any errors from creating the pidfile can be safely
ignored.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:14 +0000 (14:54 +0100)]
Clarify dev_loss_tmo capping in multipath.conf.5

The linux kernel will not allow any dev_loss_tmo setting larger
than 300 if fast_io_fail is not set. So we should document this
in the manpage.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Christophe Varoqui [Tue, 8 Jan 2013 23:24:20 +0000 (00:24 +0100)]
Remove a conflict resolution left-over in multipath.conf.5

Hannes Reinecke [Tue, 8 Jan 2013 13:54:13 +0000 (14:54 +0100)]
multipath.conf.5: Clarify dev_loss_tmo settings

We need to document that dev_loss_tmo is in fact modified
by no_path_retry.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:12 +0000 (14:54 +0100)]
multipath.init.suse: Update usage message

The usage message in multipath.init.suse doesn't list 'reload'.
And 'reload' is a misnomer, as it's actually a restart.
So rename 'reload' to 'restart' and add to the usage message.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:11 +0000 (14:54 +0100)]
Syntax error in /etc/init.d/boot.multipath

The multipath init script has a syntax error in line 113.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:10 +0000 (14:54 +0100)]
Make 'allocated' an integer in vector.h

I don't trust the programmers here, as we're unconditionally
decreasing the 'allocated' setting on vector_free().
So better make that an integer to catch underflows.
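
A small sketch of why a signed counter helps here. The function names are hypothetical and this is not the vector.h code. An unsigned counter wraps around silently when decremented past zero, whereas a signed one turns negative and can be checked:

```c
/* Unsigned counter: decrementing past zero wraps to a huge value. */
unsigned int release_unsigned(unsigned int allocated, unsigned int step)
{
    return allocated - step;
}

/* Signed counter: the same mistake yields a negative value. */
int release_signed(int allocated, int step)
{
    return allocated - step;
}

/* A simple sanity check that only works for the signed variant. */
int underflowed(int allocated)
{
    return allocated < 0;
}
```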

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:09 +0000 (14:54 +0100)]
Use VECTOR_SIZE() defines

The size of a vector slot might be larger than one, so we should
be using the VECTOR_SIZE() define everywhere.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:08 +0000 (14:54 +0100)]
Fix race condition in stop_waiter_thread()

The signal handler might run before we had a chance to
set the 'waiter' context to '0', so better do it beforehand.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:59 +0000 (14:53 +0100)]
Double free in disassemble_map()

Label 'out1' already frees 'word'; no need to do it here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:07 +0000 (14:54 +0100)]
libmultipath: Print out uevent sequence number

For debugging we should be printing out the sequence number.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:06 +0000 (14:54 +0100)]
Clean up uevent queue on shutdown

During shutdown there might be some unprocessed events
in the queue. So clear them up.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:05 +0000 (14:54 +0100)]
Update 'no_path_retry' correctly for failed paths

The bug is triggered if a path-failed event is received by multipathd after all
paths have already been marked as failed. Surprisingly enough, it seems to
happen quite often; a colleague of mine who tested this hit the bug every time.

Here is the event sequence that explains this bug. I left some messages in for
clarity; the full log is available on request. We have completed initialization and
set the feature queue_if_no_path for map CX_201 by virtue of using no_path_retry >
0.

Aug 31 10:49:09 | CX_201: devmap event #18
Aug 31 10:49:09 | CX_201: discover
Aug 31 10:49:09 | CX_201: rr_weight = 1 (internal default)
Aug 31 10:49:09 | CX_201: pgfailback = -2 (controller setting)
Aug 31 10:49:09 | CX_201: no_path_retry = 2 (controller setting)
Aug 31 10:49:09 | pg_timeout = NONE (internal default)
Aug 31 10:49:09 | 65:192: mark as failed
Aug 31 10:49:09 | CX_201: remaining active paths: 3
Aug 31 10:49:09 | 8:192: mark as failed
Aug 31 10:49:09 | CX_201: remaining active paths: 2
Aug 31 10:49:09 | CX_201: devmap event #19
Aug 31 10:49:09 | CX_201: discover
Aug 31 10:49:09 | CX_201: rr_weight = 1 (internal default)
Aug 31 10:49:09 | CX_201: pgfailback = -2 (controller setting)
Aug 31 10:49:09 | CX_201: no_path_retry = 2 (controller setting)
Aug 31 10:49:09 | pg_timeout = NONE (internal default)

Two paths were failed by the driver; multipathd marked them as failed.

Aug 31 10:49:09 | checker failed path 66:0 in map CX_201
Aug 31 10:49:09 | CX_201: remaining active paths: 1

Checker failed third path

Aug 31 10:49:09 | checker failed path 8:96 in map CX_201
Aug 31 10:49:09 | CX_201: Entering recovery mode: max_retries=2
Aug 31 10:49:09 | CX_201: remaining active paths: 0

Checker failed last path; multipathd entered retry loop.

Aug 31 10:49:10 | CX_201: devmap event #20

We got late event about failed path

Aug 31 10:49:10 | CX_201: discover

Start discovery. Call update_multipath -> setup_multipath ->
update_multipath_strings -> update_multipath_table -> disassemble_map.

Now disassemble_map tries to set the no_path_retry value from the kernel. This
obviously is not going to work, as the kernel can only remember a Boolean
(queue/fail), while no_path_retry is an arbitrary integer. So no_path_retry is set
to NO_PATH_RETRY_QUEUE from the kernel.

Aug 31 10:49:10 | CX_201: rr_weight = 1 (internal default)
Aug 31 10:49:10 | CX_201: pgfailback = -2 (controller setting)

At this point we call set_no_path_retry:

set_no_path_retry(struct multipath *mpp)
{
        mpp->retry_tick = 0;
        mpp->nr_active = pathcount(mpp, PATH_UP) + pathcount(mpp, PATH_GHOST);
        if (mpp->nr_active > 0)
                select_no_path_retry(mpp);

So

1) retry_tick is reset
2) nr_active = 0 (no active path)
3) we do not set no_path_retry from config file because nr_active == 0 => left
with NO_PATH_RETRY_QUEUE.

Aug 31 10:49:10 | pg_timeout = NONE (internal default)

From now on there are no state changes, so the map is hung forever.
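
The sequence above can be modelled in a few lines. This is a simplified sketch, not the multipathd code: struct map_model, MODEL_QUEUE_FOREVER and both functions are hypothetical stand-ins, and the numeric value of the queue marker is an assumption.

```c
#define MODEL_QUEUE_FOREVER (-1)   /* stand-in for NO_PATH_RETRY_QUEUE; value assumed */

struct map_model {
    int no_path_retry;   /* effective setting */
    int configured;      /* value from the configuration */
    int nr_active;       /* number of usable paths */
    int retry_tick;      /* remaining retry countdown */
};

/* Mirrors the quoted excerpt: the configured value is only re-applied
 * while at least one path is active. */
void model_set_no_path_retry(struct map_model *m)
{
    m->retry_tick = 0;
    if (m->nr_active > 0)
        m->no_path_retry = m->configured;
}

/* A late devmap event: parsing the kernel table overwrites the setting
 * with the kernel's Boolean view before set_no_path_retry runs. */
void model_late_event(struct map_model *m)
{
    m->no_path_retry = MODEL_QUEUE_FOREVER;
    model_set_no_path_retry(m);
}
```

With nr_active == 0 the model ends up queueing forever with the retry countdown reset, which matches the hang described above; with one active path the configured value is restored.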

Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:04 +0000 (14:54 +0100)]
Print log messages when updating tables failed

Add some logging messages to identify the case of failure.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:03 +0000 (14:54 +0100)]
Make log_pthread more robust

We don't need to allocate memory for mutexes, we can just
be using static variables. And valgrind complained about
logqueue flush from shutdown, so don't do this.
The normal shutdown process should be flushing the log
queue anyway.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:02 +0000 (14:54 +0100)]
Do not call sysfs_get_timeout for non-SCSI devices

Only SCSI devices have a timeout, so there is no point in
trying to set a timeout for other types.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:01 +0000 (14:54 +0100)]
Path checker should return PATH_DOWN when no path is found

If the path checker fails to look up the path in sysfs, it's
already gone, so we should rather return 'PATH_DOWN' here.
Otherwise the path will never be marked failed and no failover
will happen.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:54:00 +0000 (14:54 +0100)]
libmultipath: prio keyword ignored for multipath config

When specifying the 'prio' keyword in the multipath section
of the configuration file the value is ignored.
The problem is that the 'wwid' value is set only after the call
to select_prio(), so the correct definition couldn't be
found in the config file.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:58 +0000 (14:53 +0100)]
Switch off 'queue_if_no_path' before removing maps

Before we try to flush a map we have to switch off the
'queue_if_no_path' setting to flush any outstanding I/O.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:57 +0000 (14:53 +0100)]
Inconsistent string quoting

When printing the hardware table strings are quoted twice, and
even numerical values have single quotes. This patch removes
the double quotes and retains single quotes only for string
values.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:56 +0000 (14:53 +0100)]
Check return code from pathinfo()

Pathinfo might fail, which indicates that the path is not
available anymore. So check the return value and take
appropriate action.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:55 +0000 (14:53 +0100)]
Increase parameter buffer

Multipath is using an internal static buffer for assembling
device-mapper tables, which might be too small for large setups.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:54 +0000 (14:53 +0100)]
libmultipath: error checking in remove_features()

An error check was missing in remove_features().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:53 +0000 (14:53 +0100)]
Clarify setting origin in propsel.c

We should differentiate between config file and internal default.
And the controller settings are not defaults.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:52 +0000 (14:53 +0100)]
Print out multipath alias for flush_on_last_del messages

Added for consistency.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:51 +0000 (14:53 +0100)]
Incorrect inquiry vendor length in hds prioritizer

The inquiry vendor length is 8 bytes, but snprintf writes at most
the given number of bytes _including_ the terminating NUL byte. So
we need to supply a 9 byte buffer here.
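
A short illustration of the off-by-one, using only standard C. The helper name copy_vendor and the VENDOR_LEN macro are made up for this sketch and are not the prioritizer's code:

```c
#include <stdio.h>
#include <string.h>

#define VENDOR_LEN 8   /* width of the inquiry vendor field, per the text above */

/* Copy the vendor field into dst and return the length actually stored.
 * snprintf() counts the terminating NUL in dst_size, so an 8 byte buffer
 * can hold only 7 characters of the vendor string. */
size_t copy_vendor(char *dst, size_t dst_size, const char *inquiry_vendor)
{
    snprintf(dst, dst_size, "%.*s", VENDOR_LEN, inquiry_vendor);
    return strlen(dst);
}
```

Called with an 8 byte destination and an 8 character vendor string, the helper stores 7 characters; with a 9 byte destination it stores all 8.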

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:50 +0000 (14:53 +0100)]
Valgrind fixes for prioritizer

Declaring an array does not zero out its contents. So we might
be reading random garbage here.
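
For illustration, two common ways to get a zeroed local array in C. This is a generic sketch with made-up buffer names, not the prioritizer code itself:

```c
#include <string.h>

/* Returns the number of non-zero bytes found in two explicitly zeroed
 * buffers; a plain "unsigned char buf[32];" would give no such guarantee. */
int count_nonzero_after_init(void)
{
    unsigned char sense[32] = { 0 };   /* initializer zero-fills the whole array */
    unsigned char cdb[16];
    int count = 0;
    size_t i;

    memset(cdb, 0, sizeof(cdb));       /* explicit zeroing after declaration */
    for (i = 0; i < sizeof(sense); i++)
        count += sense[i] != 0;
    for (i = 0; i < sizeof(cdb); i++)
        count += cdb[i] != 0;
    return count;
}
```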

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:49 +0000 (14:53 +0100)]
Checker name is not displayed on failure

If add_checker() isn't able to locate the checker
it won't display the name in free_checker().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:48 +0000 (14:53 +0100)]
Introduce MP_FAST_IO_FAIL_UNSET

For completeness; all other special values are encoded with
defines, so 'unset' should be, too.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:47 +0000 (14:53 +0100)]
Do not trigger a map reload on priority updates

update_path_groups() is just there to update the priority groups,
so it should trigger a table reload only if the priority has
indeed changed.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:46 +0000 (14:53 +0100)]
libmultipath: Fix typo in mp_prio_handler()

The mpentry is found in conf->mptable, not conf->hwtable.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:45 +0000 (14:53 +0100)]
Add TAGS makefile target

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:44 +0000 (14:53 +0100)]
Accept several whitespaces in bindings file

Prior versions of multipathd would accept several whitespaces
in the bindings file.
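
One way to tolerate runs of whitespace between the two fields is to let sscanf() skip them. This is a sketch only: parse_binding and the buffer sizes are assumptions, and the actual parser in multipath-tools may work differently.

```c
#include <stdio.h>

/* Parse "alias <whitespace> wwid" from one bindings line.
 * "%s" skips any amount of leading blanks and tabs, so several
 * separators between the fields are accepted.
 * Returns 0 on success, -1 if either field is missing. */
int parse_binding(const char *line, char alias[64], char wwid[128])
{
    return sscanf(line, " %63s %127s", alias, wwid) == 2 ? 0 : -1;
}
```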

Signed-off-by: Hannes Reinecke <hare@suse.de>
Petr Uzel [Tue, 8 Jan 2013 13:53:43 +0000 (14:53 +0100)]
prio: fix merging of prioritizers with different args

Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:42 +0000 (14:53 +0100)]
libmultipath: resource leak in read_value_block()

read_value_block() allocates the vector 'elements', but
doesn't free it on error.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:41 +0000 (14:53 +0100)]
Fixup pathgroup allocation in disassemble_map()

The check for empty path groups in disassemble_map() is not quite
correct; we might end up removing the pathgroup vector even though
there are some entries in it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:40 +0000 (14:53 +0100)]
Remove newline from condlog()

condlog() already adds a newline to each message.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Hannes Reinecke [Tue, 8 Jan 2013 13:53:39 +0000 (14:53 +0100)]
libmultipath: Invalid check for mpp->wwid in dm_addmap()

mpp->wwid is an array, and so a check against NULL
is wrong; we need to use strlen here.
Found by coverity.
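
To illustrate the point: a character array embedded in a struct always has a valid address, so only its content can tell whether a WWID is set. The struct, its size and the helper below are hypothetical and only model the situation:

```c
#include <string.h>

#define MODEL_WWID_SIZE 128   /* illustrative size, not taken from the source */

struct mp_model {
    char wwid[MODEL_WWID_SIZE];   /* embedded array, never a NULL pointer */
};

/* Comparing mpp->wwid against NULL would always be true, so test
 * whether the string is non-empty instead. */
int has_wwid(const struct mp_model *mpp)
{
    return mpp != NULL && strlen(mpp->wwid) > 0;
}
```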

Signed-off-by: Hannes Reinecke <hare@suse.de>
Mike Christie [Tue, 8 Jan 2013 19:36:34 +0000 (13:36 -0600)]
libmultipath: fix segfault when vector is null

While performing tests that caused paths to get added
and deleted, we hit a segfault. We traced it to the
vector struct being NULL. This patch fixes the problem
by checking for a NULL vector before accessing it.
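
A sketch of the kind of guard described, using a hypothetical vector type. struct vec_model and both accessors are illustrative and do not reproduce vector.h:

```c
#include <stddef.h>

struct vec_model {
    int allocated;   /* number of slots */
    void **slot;     /* slot array */
};

/* Size accessor that treats a NULL vector as empty. */
int vec_size(const struct vec_model *v)
{
    return v ? v->allocated : 0;
}

/* Slot accessor that refuses NULL vectors and out-of-range indexes. */
void *vec_slot(const struct vec_model *v, int i)
{
    if (!v || i < 0 || i >= v->allocated)
        return NULL;
    return v->slot[i];
}
```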

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Phillip Susi [Sun, 6 Jan 2013 02:57:30 +0000 (21:57 -0500)]
fix extended partition mapping

The linux kernel maps the extended partition only
so that LILO can be installed there.  The length is always set
to two sectors to allow this, and most tools know to ignore the
device.  kpartx was mapping the entire extended partition, then
stacking the logical partitions on top of it.  This presented
a device that looked like an entirely separate disk that
contains only the logical partitions.  This patch fixes kpartx
to conform with the normal Linux behavior.

Wang Sheng-Hui [Sat, 5 Jan 2013 19:36:07 +0000 (20:36 +0100)]
libmpathpersist: correct function description in mpath_persist.h and man file

* For mpath_persistent_reserve_out, we have "#define MPATH_PRTPE_EA_RO
  0x06". Correct the description of the function prototype
  mpath_persistent_reserve_out in mpath_persist.h and man file.

* Correct typo in the description for the function prototype
  mpath_persistent_reserve_in in mpath_persist.h

Guangyu Sun [Sat, 22 Dec 2012 08:40:59 +0000 (09:40 +0100)]
libmultipath: fix flush_on_last_del config handlers

Currently flush_on_last_del cannot be set to FLUSH_DISABLED if its default
value is FLUSH_ENABLED. This patch fixes the issue.

Peter Gervai [Sun, 30 Sep 2012 20:48:37 +0000 (22:48 +0200)]
iet prioritizer fix

- revert the path weight allocation : heavy weight for the
preferred path, light for others
- mention the original author's name
- embed a short usage documentation

Benjamin Marzinski [Mon, 20 Aug 2012 22:26:46 +0000 (17:26 -0500)]
multipath: fix setting sysfs fc timeout parameters

Multipath was accidentally trying to write to the directory where
dev_loss_tmo and fast_io_fail_tmo were located instead of to the files
themselves. Also, if dev_loss_tmo was unset, it was trying to set it
to 0. Finally, it wasn't correctly checking for errors in setting
the files.  This patch fixes these issues.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Mon, 20 Aug 2012 22:19:26 +0000 (17:19 -0500)]
multipath: add wwids_file multipath.conf option

This patch adds a wwids_file multipath.conf option, so that users can
move the wwids file from its default location at /etc/multipath/wwids.
It also corrects the default bindings file location in the multipath.conf
manpage and makes the bindings_file value always print out when you
display the configuration, as is done with the other default values.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 27 Jul 2012 20:55:24 +0000 (15:55 -0500)]
multipath: check if a device belongs to multipath

This patch adds a new multipath option "-c", which checks if a device
belongs to multipath.  This can be done during the add uevent for the
device, before the multipath device has even been created.  This allows
udev to be able to handle multipath path devices differently.  To do
this multipath now keeps track of the wwids of all previously created
multipath devices in /etc/multipath/wwids. The file creation and
editing code from alias.[ch] has been split out into file.[ch]. When a
device is checked to see if it's a multipath path, its wwid is
compared against the ones in the wwids file. Also, since the
uid_attribute may not have been added to the udev database entry for
the device if this is called in a udev rule, when using the "-c" option,
get_uid will now also check the process's environment variables for the
uid_attribute.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 27 Jul 2012 20:56:19 +0000 (15:56 -0500)]
multipath: add followover failback mode

This patch adds a new failback mode, followover, to deal with multiple
computers accessing the same active/passive storage devices.  In these
cases, if only one node loses access to the primary paths, it will
force a trespass to the secondary paths.  If the nodes are configured
with immediate failback, the other nodes will trespass back to the
primary paths, and the machines will ping-pong the storage. If the
nodes are configured with manual failback, this won't happen. However,
when the primary path is restored on the node that lost access to it,
the nodes won't automatically fail back to it.  In followover mode, they
will.

Followover mode works by only failing back when a path comes back online
from a pathgroup that previously had no working paths.  For this to
work, the paths need an additional attribute, chkrstate. This is just like
the path state, except it is not updated when the path's state is changed
by the kernel, only when the path checker function sees that the path is
down.  This is necessary because when a trespass occurs, all the outstanding
IO to the previously active paths will fail, and the kernel will mark the
path as down.  But for failback to happen in followover mode, the paths must
actually be down, not just in a ghost state.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 27 Jul 2012 20:57:18 +0000 (15:57 -0500)]
multipath: fix cciss device names

When we're looking for cciss devices in sysfs, they have a "!" not a "/".
If users run multipath on a cciss device using its devnode name,
/dev/cciss/cXdY, multipath should convert that to the sysfs name.
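
The conversion can be sketched as follows, assuming the name has already been stripped of its /dev/ prefix. The function name is hypothetical and this is not the code from the patch:

```c
/* Replace every '/' in a devnode name with '!', in place, so that
 * e.g. "cciss/c0d1" becomes the sysfs-style "cciss!c0d1". */
void devnode_to_sysfs_name(char *name)
{
    char *p;

    for (p = name; *p; p++)
        if (*p == '/')
            *p = '!';
}
```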

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 27 Jul 2012 20:54:29 +0000 (15:54 -0500)]
multipath: remove callout code

Since nothing is using the callout code anymore, I've removed it.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Jun'ichi Nomura [Fri, 17 Aug 2012 08:40:39 +0000 (17:40 +0900)]
multipath-tools: prevent unexpected swapping of underlying LUNs

When you want to rename a multipath device
but the new alias is already used by another multipath device,
multipath-tools mistakenly reloads the table for the original multipath device
into the other multipath device.
That could lead to very bad results, such as I/O errors and data corruption.

This patch checks for such a condition and gives up renaming with an error log.
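
The check can be pictured roughly like this. It is a simplified sketch over an in-memory list; struct binding_model and alias_in_use are hypothetical names, and the real patch operates on device-mapper maps rather than on such an array.

```c
#include <string.h>

struct binding_model {
    const char *alias;   /* name of an existing map */
    const char *wwid;    /* WWID that map belongs to */
};

/* Returns 1 if 'alias' is already taken by a map with a different WWID,
 * in which case the rename should be refused; 0 otherwise. */
int alias_in_use(const struct binding_model *maps, size_t n,
                 const char *alias, const char *wwid)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(maps[i].alias, alias) == 0 &&
            strcmp(maps[i].wwid, wwid) != 0)
            return 1;
    return 0;
}
```

Applied to the two maps in the bindings example that follows, renaming the map with WWID 212140084abcd0000 to mpathb would be rejected, because mpathb currently belongs to 212150084abcd0000.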

For example, suppose you have the following 'bindings' file:

   # cat /etc/multipath/bindings
   mpatha 212140084abcd0000
   mpathb 212150084abcd0000

and a logical volume 'VG/LV0' on top of mpathb,
which is on top of /dev/sde(8:64) and /dev/sdk(8:160):

   # dmsetup ls --tree
   mpatha (253:1)
    ├─ (8:144)
    └─ (8:48)
   VG-LV0 (253:2)
    └─mpathb (253:0)
       ├─ (8:160)
       └─ (8:64)

Then you decide to swap their names and change the 'bindings' as follows:

   # cat /etc/multipath/bindings
   mpathb 212140084abcd0000
   mpatha 212150084abcd0000

you'll get this after 'service multipathd reload':

   # dmsetup ls --tree
   mpatha (253:1)
    ├─ (8:160)
    └─ (8:64)
   VG-LV0 (253:2)
    └─mpathb (253:0)
       ├─ (8:144)
       └─ (8:48)

Now you suddenly have 'VG/LV0' on top of /dev/sdd(8:48) and /dev/sdj(8:144),
which is obviously wrong and will corrupt data if you write to 'VG/LV0'.

Moger, Babu [Tue, 10 Jul 2012 19:12:13 +0000 (19:12 +0000)]
multipath: retry the DID_SOFT_ERROR for rdac checker commands

Sometimes we have seen immediate path failures for DID_SOFT_ERROR status. Just add a retry, as for other statuses.
This will basically add 5 retries.

Here are the messages.
Jun  4 17:46:42 ictc-billy kernel: mpt2sas0: log_info(0x31120100): originator(PL), code(0x12), sub_code(0x0100)
Jun  4 17:46:42 ictc-billy kernel: sd 1:0:0:47: [sdn] Unhandled error code
Jun  4 17:46:42 ictc-billy kernel: sd 1:0:0:47: [sdn] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Jun  4 17:46:42 ictc-billy kernel: sd 1:0:0:47: [sdn] CDB: Write(10): 2a 00 00 06 e4 a0 00 00 20 00
Jun  4 17:46:42 ictc-billy kernel: device-mapper: multipath: Failing path 8:208.
Jun  4 17:46:42 ictc-billy multipathd: 8:208: mark as failed
Jun  4 17:46:42 ictc-billy multipathd: mpathat: remaining active paths: 1

Signed-off-by: Babu Moger <babu.moger@netapp.com>
Benjamin Marzinski [Mon, 11 Jun 2012 21:32:35 +0000 (16:32 -0500)]
multipath: fix libudev bug in sysfs_get_tgt_nodename

In a recent patch, I introduced a bug into sysfs_get_tgt_nodename().
multipath must not unreference the target udevice before it copies the
tgt_nodename to another location, otherwise the value pointer will be
pointing at freed memory.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 8 Jun 2012 17:28:28 +0000 (12:28 -0500)]
multipath: remove unnecessary fds from alias code

Originally, the alias code duped the bindings file fd so that a stream could
be opened on it, and then closed without closing the original fd. Later,
closing the stream was moved to the end of the function, to avoid a locking
bug.  Because of this, there isn't any point to duping the fd.  Also, since
the stream is still open when the original fd is used, the stream
should be flushed after it's done being used.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Tue, 5 Jun 2012 23:04:36 +0000 (18:04 -0500)]
multipath: libudev cleanup and bugfixes

get_refwwid wasn't working anymore, since it wasn't setting the path's udevice.
Also, cli_add_path was dereferencing a NULL pointer (pp). Finally, there were
a number of places where udev devices weren't getting unreferenced when they
should have been, causing memory leaks.  This patch cleans these up.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 25 May 2012 04:57:43 +0000 (23:57 -0500)]
multipath: Fix warnings from stricter compile options.

With stricter compilation options, multipath printed a number of
warnings during compilation. Some of them were actual bugs. Others
couldn't cause any problems.  This patch cleans up all the new
warnings.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 25 May 2012 04:57:42 +0000 (23:57 -0500)]
multipath: Build with standard rpm cflags

This patch makes multipath build with the standard redhat rpm cflags, which
can help catch some code errors.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 23 May 2012 21:42:20 +0000 (16:42 -0500)]
multipath: Some device configuration changes for NetApp LUNs

NetApp has asked for these configuration changes.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 23 May 2012 21:05:20 +0000 (16:05 -0500)]
multipath: fix scsi async tur checker corruption

Since the tur checker runs asynchronously in its own thread, there is nothing
that keeps a path from being orphaned or deleted before the tur thread has
finished. When this happens the checker struct gets deleted.  However, the tur
thread might still be writing to that memory.  This can lead to memory
corruption.  This patch adds all of the necessary data to the checker context,
and makes the tur thread only use that. This way, if the checker is deleted
while the thread is still using the context, the thread will clean up the
context itself.

Since the context can only be freed when both the thread and the path's checker
structure have stopped needing it, and these can finish with the
context asynchronously with respect to each other, the context has a holders
counter, protected by a spinlock, to keep track of the users.  When the
counter drops to zero, the context gets freed.
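
A minimal sketch of such a holders counter. Note the assumptions: the type and function names are invented for illustration, and a pthread mutex stands in for the spinlock mentioned above to keep the example portable.

```c
#include <pthread.h>
#include <stdlib.h>

struct ctx_model {
    pthread_mutex_t lock;   /* stands in for the spinlock */
    int holders;            /* number of users still needing the context */
};

/* Allocate a context that starts with the given number of holders. */
struct ctx_model *ctx_new(int holders)
{
    struct ctx_model *ct = calloc(1, sizeof(*ct));

    if (!ct)
        return NULL;
    pthread_mutex_init(&ct->lock, NULL);
    ct->holders = holders;
    return ct;
}

/* Drop one holder; the last one to leave frees the context.
 * Returns 1 if this call freed it, 0 otherwise. */
int ctx_put(struct ctx_model *ct)
{
    int remaining;

    pthread_mutex_lock(&ct->lock);
    remaining = --ct->holders;
    pthread_mutex_unlock(&ct->lock);
    if (remaining > 0)
        return 0;
    pthread_mutex_destroy(&ct->lock);
    free(ct);
    return 1;
}
```

Whichever side finishes last, the checker or the thread, ends up freeing the context, so neither can write to memory the other has already released.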

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 23 May 2012 20:36:30 +0000 (15:36 -0500)]
multipath: Allow user_friendly_names in more config sections

This patch adds support for setting user_friendly_names in the devices and
multipaths config sections.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Wed, 23 May 2012 20:29:05 +0000 (15:29 -0500)]
multipath: Make sure we store all the hwentry attributes.

Not all of the attributes from the hardware table entries were getting stored
when the built-in devices configurations were being setup.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Cedric Buissart [Wed, 23 May 2012 20:25:45 +0000 (22:25 +0200)]
Minor typo in multipath.conf.annotated

Benjamin Marzinski [Sat, 19 May 2012 06:37:03 +0000 (01:37 -0500)]
multipath: clean up code for stopping the waiter threads

The way multipathd currently stops the waiter threads needs some work.
Right now they are stopped by being sent the SIGUSR1 signal. However their
cleanup code assumes that they are being cancelled, just like all the other
threads are.  There's no reason for them to be so unnecessarily
complicated and different from the other threads.

This patch does a couple of things.  First, it removes the mutex from
the event_thread.  This wasn't doing anything. It was designed to protect
the wp->mapname variable, which the waiter threads were checking to see
if they should quit. However, the mutex was only ever being used by the
thread itself, and it clearly didn't need to serialize with itself.  Also,
the function to clear the mapname, signal_waiter(), was set with
pthread_cleanup_push(), which never got called early, since the threads
weren't being cancelled.  Thus, the mapname never got cleared
until the pthreads were about to shut down.

The patch also rips out all the signal stopping code, and just uses
pthread_cancel.  There already are cancellation points in the waiter
thread code. Between the cancellation points, both explicit and implicit,
and the fact that the waiter threads will never be killed except when the
killer is holding the vecs lock, there shouldn't be any place where the
waiter thread can access freed data.

To make sure the waiter thread cleans itself up properly, the dmt
has been moved into the event_thread structure, and is destroyed in
free_waiter() if necessary.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin Marzinski [Fri, 18 May 2012 22:33:26 +0000 (17:33 -0500)]
multipath: fix select_no_path_retry for flushing devices.

The select_no_path_retry code was falling through if a flush was
in progress, and so it wasn't honoring flush_on_last_del.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
9 years agoFix compilation on older udev versions
Hannes Reinecke [Thu, 19 Apr 2012 12:03:47 +0000 (14:03 +0200)]
Fix compilation on older udev versions

Older udev versions do not export 'udev_monitor_set_receive_buffer_size',
so we need to comment it out on those systems.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agoCompilation fix for system not providing OOM_SCORE_ADJ_MIN
Hannes Reinecke [Thu, 19 Apr 2012 12:03:26 +0000 (14:03 +0200)]
Compilation fix for system not providing OOM_SCORE_ADJ_MIN

Older systems do not provide a definition for OOM_SCORE_ADJ_MIN,
so we need to test against this.

Signed-off-by: Hannes Reinecke <hare@suse.de>
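
A guard of the kind this fix describes might look like the sketch below.
The fallback value -1000 is the minimum the kernel accepts for
oom_score_adj; the helper function is hypothetical and only illustrates
that code using the macro keeps compiling when the system headers do not
define it.

```c
#include <assert.h>
#include <stdio.h>

/* Fall back to the kernel's documented minimum if the headers lack it. */
#ifndef OOM_SCORE_ADJ_MIN
#define OOM_SCORE_ADJ_MIN (-1000)
#endif

/* Hypothetical helper: write the minimum adjustment to an open stream. */
int write_oom_score_adj_min(FILE *fp)
{
    assert(fp != NULL);
    return fprintf(fp, "%d", OOM_SCORE_ADJ_MIN) > 0 ? 0 : -1;
}
```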
9 years agoRemove all references to hand-crafted sysfs code
Hannes Reinecke [Thu, 19 Apr 2012 09:09:06 +0000 (11:09 +0200)]
Remove all references to hand-crafted sysfs code

We've now converted everything to libudev, so we can get rid
of all the variables etc.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agoUse udev devices during discovery
Hannes Reinecke [Thu, 19 Apr 2012 09:09:05 +0000 (11:09 +0200)]
Use udev devices during discovery

Remove all hand-crafted sysfs access code and replace it with
libudev functions.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agoUse udev attribute instead of getuid_callout()
Hannes Reinecke [Thu, 19 Apr 2012 09:09:04 +0000 (11:09 +0200)]
Use udev attribute instead of getuid_callout()

By the time we're receiving an event udev has already figured out
a unique ID. So we can just use that and get rid of the
callout.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agodiscovery: Fixup cciss discovery
Hannes Reinecke [Thu, 19 Apr 2012 09:09:03 +0000 (11:09 +0200)]
discovery: Fixup cciss discovery

We can get the sysfs attributes directly from the parent.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agoUse struct udev_device instead of sysdev
Hannes Reinecke [Thu, 19 Apr 2012 09:09:02 +0000 (11:09 +0200)]
Use struct udev_device instead of sysdev

Remove hand-crafted sysdev and use struct udev_device instead.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agoRemove stale variable in sysfs_attr_get_value
Hannes Reinecke [Thu, 19 Apr 2012 09:09:01 +0000 (11:09 +0200)]
Remove stale variable in sysfs_attr_get_value

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agouse struct udev_device during discovery
Hannes Reinecke [Thu, 19 Apr 2012 09:09:00 +0000 (11:09 +0200)]
use struct udev_device during discovery

We can save quite some parsing etc. by just using struct udev_device.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agoUse udev enumeration during discovery
Hannes Reinecke [Thu, 19 Apr 2012 09:08:59 +0000 (11:08 +0200)]
Use udev enumeration during discovery

Instead of scanning /sys/block by hand we should be using enumeration
provided by udev.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agoAdd global udev reference pointer to config
Hannes Reinecke [Thu, 19 Apr 2012 09:08:58 +0000 (11:08 +0200)]
Add global udev reference pointer to config

Instead of using a local reference to udev we should be moving it
to the global config structure.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agomultipathd: use struct path as argument for event processing
Hannes Reinecke [Thu, 19 Apr 2012 09:08:57 +0000 (11:08 +0200)]
multipathd: use struct path as argument for event processing

ev_add/remove_path should be using struct path as the argument;
this makes transitioning to use libudev easier.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agomultipathd: remove references to sysfs_device
Hannes Reinecke [Thu, 19 Apr 2012 09:08:56 +0000 (11:08 +0200)]
multipathd: remove references to sysfs_device

When processing events we don't need to take a reference to the
sysfs_device; it will be done later on during pathinfo.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agoUse devpath as argument for sysfs functions
Hannes Reinecke [Thu, 19 Apr 2012 09:08:55 +0000 (11:08 +0200)]
Use devpath as argument for sysfs functions

Whenever we pass in a sysfs structure to functions we're only
ever interested in the devpath. So we can as well pass in the
device path directly, without reference to the sysfs structure.

Signed-off-by: Hannes Reinecke <hare@suse.de>
9 years agomultipath: enable getting uevents through libudev
Benjamin Marzinski [Tue, 10 Apr 2012 04:01:54 +0000 (23:01 -0500)]
multipath: enable getting uevents through libudev

udev is removing support for RUN+="socket:..." rules. For now, I've kept
all the existing uevent code, but I've added a new method for getting the
uevent information using libudev that will be tried first.

This version includes more error checking.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
9 years agomultipath: lower the log level for rdac TAS messages
Moger, Babu [Fri, 6 Apr 2012 21:49:19 +0000 (21:49 +0000)]
multipath: lower the log level for rdac TAS messages

This patch lowers the log level for rdac TAS related messages.
These calls are expected to fail in cluster configurations due to reservations.

Signed-off-by: Babu Moger <babu.moger@netapp.com>
9 years agomultipath: blacklist all the management Luns by default
Moger, Babu [Wed, 14 Mar 2012 21:20:22 +0000 (21:20 +0000)]
multipath: blacklist all the management Luns by default

This patch adds blacklisting for all the management LUNs. Otherwise the
user has to manually add blacklist entries in multipath.conf for these LUNs.

Signed-off-by: Babu Moger <babu.moger@netapp.com>
9 years agoFix fast_io_fail capping
Jun'ichi Nomura [Mon, 12 Mar 2012 11:56:52 +0000 (20:56 +0900)]
Fix fast_io_fail capping

Hi Christophe,

fast_io_fail is only meaningful if it is smaller than dev_loss_tmo.
Setting dev_loss_tmo value to fast_io_fail ends up with -EINVAL.
If the fast_io_fail is not configured properly, turning it off
seems to be the right behavior.

MP_FAST_IO_FAIL_OFF is -1, defined in the following patch:
  [PATCH] Fix for setting '0' to fast_io_fail
  http://www.redhat.com/archives/dm-devel/2012-March/msg00047.html

--
Jun'ichi Nomura, NEC Corporation

9 years agoFix for setting '0' to fast_io_fail
Jun'ichi Nomura [Mon, 12 Mar 2012 11:43:55 +0000 (20:43 +0900)]
Fix for setting '0' to fast_io_fail

Hi Christophe,

In the kernel, '0' is a valid value for fast_io_fail, meaning immediate
termination of I/Os on rport delete.
However, '0' is treated as 'not-configured' in various places of
multipath-tools, so it is not possible to set fast_io_fail to 0.

The attached patch fixes that by introducing MP_FAST_IO_FAIL_ZERO
as the internal representation of the zero value.

--
Jun'ichi Nomura, NEC Corporation
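
A rough sketch of how the zero value can be kept distinct from "not
configured", together with the capping rule from the follow-up fix
(fast_io_fail is only meaningful if it is smaller than dev_loss_tmo).
Only MP_FAST_IO_FAIL_OFF = -1 is stated in these messages; the numeric
values of the other two constants, the "off" keyword and both helper
functions are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define MP_FAST_IO_FAIL_UNSET  0     /* assumed: 0 keeps meaning "not configured" */
#define MP_FAST_IO_FAIL_OFF   (-1)   /* value given in the commit message */
#define MP_FAST_IO_FAIL_ZERO  (-2)   /* assumed internal stand-in for a real 0 */

/* Hypothetical parser: map a config string to the internal representation. */
int parse_fast_io_fail(const char *s)
{
    char *end;
    long v;

    assert(s != NULL);
    if (strcmp(s, "off") == 0)
        return MP_FAST_IO_FAIL_OFF;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || v < 0 || v > INT_MAX)
        return MP_FAST_IO_FAIL_UNSET;
    return v == 0 ? MP_FAST_IO_FAIL_ZERO : (int)v;
}

/* Hypothetical capping: turn fast_io_fail off unless it is below dev_loss. */
int cap_fast_io_fail(int fast_io_fail, unsigned int dev_loss)
{
    unsigned int seconds;

    if (fast_io_fail == MP_FAST_IO_FAIL_UNSET ||
        fast_io_fail == MP_FAST_IO_FAIL_OFF)
        return fast_io_fail;
    seconds = fast_io_fail == MP_FAST_IO_FAIL_ZERO ? 0u
                                                   : (unsigned int)fast_io_fail;
    return seconds >= dev_loss ? MP_FAST_IO_FAIL_OFF : fast_io_fail;
}
```

With this layout a plain `if (!fast_io_fail)` test still means "nothing
configured", while an explicit 0 survives as its own value.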

9 years agomultipath: Set 'tur' as the default path checker for NetApp LUNs
Martin George [Mon, 12 Mar 2012 08:22:11 +0000 (13:52 +0530)]
multipath: Set 'tur' as the default path checker for NetApp LUNs

In our tests, we've noticed that the 'tur' checker provides
better performance compared to 'directio' primarily because 'tur'
does not use FS-based requests unlike 'directio'. Moreover with
Hannes' recent async tur enhancement, the 'tur' checker is more
efficient now than before.

So we'd prefer using 'tur' as the default path checker for NetApp
LUNs now. The below patch enables the same by updating the
.checker_name in the hwtable for NetApp LUNs.

Signed-off-by: Martin George <marting@netapp.com>
9 years agomultipath-tools: cleanup for all unused-but-set-variable variables in mpathpersist
Chauhan, Vijay [Tue, 6 Mar 2012 15:11:38 +0000 (15:11 +0000)]
multipath-tools: cleanup for all unused-but-set-variable variables in mpathpersist

This patch is a cleanup for all unused-but-set-variable variables
in mpathpersist.

Signed-off-by: Vijay Chauhan <vijay.chauhan@netapp.com>
9 years agomultipath-tools: Implementation for hex output (-H) for mpathpersist
Chauhan, Vijay [Tue, 6 Mar 2012 15:10:15 +0000 (15:10 +0000)]
multipath-tools: Implementation for hex output (-H) for mpathpersist

Add the missing implementation for hex output (-H).

Signed-off-by: Vijay Chauhan <vijay.chauhan@netapp.com>
9 years agomultipath-tools: Generalizing the vpd 0x83 processing with correct buffer length
Moger, Babu [Wed, 22 Feb 2012 18:09:10 +0000 (18:09 +0000)]
multipath-tools: Generalizing the vpd 0x83 processing with correct buffer length

Right now the buffer length for inquiry vpd 0x83 is hardcoded to 128 bytes.
This can cause problems if the total length of all the designation descriptors
exceeds 128 bytes. This was causing me issues while configuring my storage
with ALUA. I have generalized the processing to use the correct buffer length.
The patch has been tested with NetApp E-series storage.

Signed-off-by: Babu Moger <babu.moger@netapp.com>
9 years agomultipath-tools: fix the bug while processing vpd 0x83 designation descriptors
Moger, Babu [Wed, 22 Feb 2012 18:09:00 +0000 (18:09 +0000)]
multipath-tools: fix the bug while processing vpd 0x83 designation descriptors

This patch fixes a bug in processing the vpd 0x83 designation descriptors.
It removes the buggy check (> sizeof(buf)) while looping over the descriptors.
sizeof(buf) will always return 8 (on a 64-bit machine). The descriptor length can
be more than 8 bytes in some cases. This was causing problems while configuring
my storage with ALUA.

Signed-off-by: Babu Moger <babu.moger@netapp.com>
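
A value of 8 on a 64-bit machine suggests that buf was a pointer at that
point, in which case sizeof yields the pointer size rather than the buffer
size. The sketch below is not the patched code; it shows one way to walk
the descriptors using explicit lengths instead. It relies on the standard
page layout (page length in bytes 2-3 of the 4-byte page header, designator
length in byte 3 of each 4-byte descriptor header); the function name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the designation descriptors in a VPD page 0x83 response.
 * buflen is the number of valid bytes in buf and is passed in explicitly,
 * because sizeof(buf) on a pointer only gives the pointer size.
 * Returns -1 if the page or a descriptor runs past the available data.
 */
int count_designators(const unsigned char *buf, size_t buflen)
{
    size_t page_len, end, off;
    int n = 0;

    assert(buf != NULL);
    if (buflen < 4)
        return -1;
    page_len = ((size_t)buf[2] << 8) | buf[3];
    end = 4 + page_len;
    if (end > buflen)
        return -1;              /* caller should retry with a larger buffer */

    for (off = 4; off + 4 <= end; ) {
        size_t dlen = buf[off + 3];     /* designator length */

        if (off + 4 + dlen > end)
            return -1;
        n++;
        off += 4 + dlen;
    }
    return n;
}
```

The same header field also gives the size needed for a second inquiry:
4 + page_len bytes, rather than a fixed 128.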
9 years agompathpersist build fix
Christophe Varoqui [Sat, 11 Feb 2012 08:33:45 +0000 (09:33 +0100)]
mpathpersist build fix

remove -lsysfs from Makefiles. sysfs.h is provided through
-lmultipath.

9 years agomultipath: another manpage update
Benjamin Marzinski [Fri, 10 Feb 2012 18:18:38 +0000 (12:18 -0600)]
multipath: another manpage update

Missed an option with my last manpage update.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
9 years agomultipath: adjust messages
Benjamin Marzinski [Fri, 10 Feb 2012 18:16:50 +0000 (12:16 -0600)]
multipath: adjust messages

Stop the rport_id messages from being displayed all the time, and add a message
alerting users when multipath tries to set up a map and fails, or ends up
removing the map.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
9 years agokpartx: verify GUID partition entry size
Benjamin Marzinski [Fri, 10 Feb 2012 18:14:39 +0000 (12:14 -0600)]
kpartx: verify GUID partition entry size

This patch pulls in some kernel code to catch a corrupt GUID partition
table with the wrong size.

Signed-off-by: Boris Ranto <branto@redhat.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
9 years agomultipath: don't remove map twice
Benjamin Marzinski [Fri, 10 Feb 2012 18:13:12 +0000 (12:13 -0600)]
multipath: don't remove map twice

If setup_multipath fails, it removes the map itself, so don't try to remove it again.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
9 years agomultipath: cleanup dev_loss_tmo issues
Benjamin Marzinski [Fri, 10 Feb 2012 18:11:37 +0000 (12:11 -0600)]
multipath: cleanup dev_loss_tmo issues

There are a couple of issues with the dev_loss_tmo code.  First, the
comparison between fast_io_fail and dev_loss was failing for
fast_io_fail = -1. Second, if fast_io_fail_tmo was set to off, and
dev_loss was greater than 600, dev_loss_tmo would not be set. Finally,
verify_paths was calling sysfs_set_scsi_tmo without ever calling
select_fast_io_fail.  However, this hasn't been causing problems since
setup_map is always called immediately after verify_paths, and it calls
all the select_ functions correctly.  This patch fixes all these.  Now,
if setting dev_loss_tmo fails, and fast_io_fail is set to off, it will
retry with dev_loss_tmo set to 600. Also, the calls that are duplicated
between verify_paths and setup_map have been removed.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
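
The retry behaviour described above could be sketched roughly as follows.
This is an illustration under assumptions, not the code from the patch:
the setter callback stands in for the sysfs write, the stub that rejects
values above 600 only imitates a kernel refusing a large dev_loss_tmo
while fast_io_fail is off, and all names except MP_FAST_IO_FAIL_OFF are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MP_FAST_IO_FAIL_OFF   (-1)
#define DEV_LOSS_FALLBACK     600u   /* retry value named in the message */

/* Stand-in for writing dev_loss_tmo to sysfs; returns 0 on success. */
typedef int (*set_dev_loss_fn)(unsigned int seconds);

/* Demonstration stub: pretend the kernel rejects anything above 600. */
int fake_kernel_set(unsigned int seconds)
{
    return seconds > DEV_LOSS_FALLBACK ? -1 : 0;
}

/*
 * Try the requested dev_loss_tmo first. If that fails and fast_io_fail is
 * off, retry once with 600. *applied receives the value that was accepted.
 */
int apply_dev_loss(set_dev_loss_fn set, unsigned int dev_loss,
                   int fast_io_fail, unsigned int *applied)
{
    assert(set != NULL && applied != NULL);
    if (set(dev_loss) == 0) {
        *applied = dev_loss;
        return 0;
    }
    if (fast_io_fail == MP_FAST_IO_FAIL_OFF &&
        dev_loss > DEV_LOSS_FALLBACK &&
        set(DEV_LOSS_FALLBACK) == 0) {
        *applied = DEV_LOSS_FALLBACK;
        return 0;
    }
    return -1;
}
```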
9 years agomultipath: fix shutdown crashes
Benjamin Marzinski [Fri, 10 Feb 2012 18:10:11 +0000 (12:10 -0600)]
multipath: fix shutdown crashes

A number of processes don't reach a pthread cancellation point
before they use the pathvec or mpvec vectors, after they've
locked the vecs lock.  This can cause crashes on shutdown, since
these vectors are deallocated.  Also, the log thread accesses a
number of resources which may have been deallocated during shutdown
without holding any locks. This patch avoids these issues by
adding pthread_testcancel() checks after acquiring the vecs lock,
and having the child process make sure the log thread has exited
before deallocating the resources.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
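
A small sketch of the pattern described here, assuming a single mutex that
plays the role of the vecs lock. pthread_mutex_lock() is not a
cancellation point, so a thread that was cancelled while blocked on the
lock would otherwise go on to use the vectors; the explicit
pthread_testcancel() right after taking the lock lets it exit first. All
struct and function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared vectors and their lock. */
struct vecs_sketch {
    pthread_mutex_t lock;
    int *pathvec;     /* stands in for pathvec/mpvec */
    int touched;      /* set if the worker got as far as using the vector */
};

static void unlock_on_cancel(void *arg)
{
    pthread_mutex_unlock(arg);
}

static void *vecs_worker(void *arg)
{
    struct vecs_sketch *vecs = arg;

    pthread_mutex_lock(&vecs->lock);
    pthread_cleanup_push(unlock_on_cancel, &vecs->lock);
    pthread_testcancel();       /* leave now if shutdown already started */
    vecs->touched = 1;
    if (vecs->pathvec)
        vecs->pathvec[0]++;
    pthread_cleanup_pop(1);     /* unlock on the normal path as well */
    return NULL;
}

/* Simulated shutdown: cancel the worker while holding the lock.
 * Returns 0 if the worker never touched the vectors, -1 on error. */
int shutdown_demo(void)
{
    struct vecs_sketch vecs;
    pthread_t tid;
    int rc;

    rc = pthread_mutex_init(&vecs.lock, NULL);
    assert(rc == 0);
    vecs.pathvec = NULL;
    vecs.touched = 0;

    pthread_mutex_lock(&vecs.lock);
    if (pthread_create(&tid, NULL, vecs_worker, &vecs) != 0) {
        pthread_mutex_unlock(&vecs.lock);
        return -1;
    }
    pthread_cancel(tid);                /* the killer holds the lock */
    pthread_mutex_unlock(&vecs.lock);
    pthread_join(tid, NULL);

    /* The cleanup handler must have released the lock again. */
    if (pthread_mutex_trylock(&vecs.lock) != 0)
        return -1;
    pthread_mutex_unlock(&vecs.lock);
    pthread_mutex_destroy(&vecs.lock);
    return vecs.touched;
}
```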
9 years agompathpersist: Add new utility for managing persistent reservation on dm multipath...
Vijay Chauhan [Thu, 9 Feb 2012 15:00:20 +0000 (10:00 -0500)]
mpathpersist: Add new utility for managing persistent reservation on dm multipath device

The persistent reservation management utility (mpathpersist) allows cluster management software
to manage persistent reservations through an mpath device. It processes management requests from
the caller and hides the details of the management task. It also handles persistent reservation
management across the data path life cycle and state changes.

Signed-off-by: Vijay Chauhan <vijay.chauhan@netapp.com>
9 years ago[kpartx] Don't add 'p' delimiter when you shouldn't
Phillip Susi [Thu, 9 Feb 2012 20:16:21 +0000 (21:16 +0100)]
[kpartx] Don't add 'p' delimiter when you shouldn't

The 'p' delimiter is supposed to be added when the base disk name
ends in a digit.  This decision was based on the name given on the
command line, not the canonical device name, so giving /dev/dm-0
instead of /dev/mapper/foo triggered the digit test and added the
'p'.  Changed test to use the canonical name rather than the given
name.
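
The delimiter rule can be sketched as below. The function name and
signature are hypothetical and this is not kpartx's actual code; the point
is only that the digit test should be applied to the canonical map name
(for example "foo") rather than to whatever path was typed (for example
"dm-0").

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a partition name from a base name that is assumed to be canonical.
 * A 'p' is inserted only when the base name ends in a digit.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int format_partname(char *out, size_t outlen, const char *base, int part)
{
    size_t n;
    const char *delim;
    int w;

    assert(out != NULL && base != NULL);
    n = strlen(base);
    delim = (n > 0 && isdigit((unsigned char)base[n - 1])) ? "p" : "";
    w = snprintf(out, outlen, "%s%s%d", base, delim, part);
    return (w < 0 || (size_t)w >= outlen) ? -1 : 0;
}
```

Called with "mpath0" and partition 1 this yields "mpath0p1"; called with
"foo" it yields "foo1", whichever device node was given on the command line.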