libmultipath: timeout on unresponsive tur thread
author Benjamin Marzinski <bmarzins@redhat.com>
Wed, 10 Oct 2018 18:01:13 +0000 (13:01 -0500)
committer Christophe Varoqui <christophe.varoqui@opensvc.com>
Fri, 12 Oct 2018 07:38:28 +0000 (09:38 +0200)
If the tur checker thread has been cancelled but isn't responding,
time out instead of falling back to a synchronous check.  This keeps
one bad device from stalling all of multipathd.

Fixes: 455242d ("libmultipath: fix tur checker timeout")
Cc: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
libmultipath/checkers/tur.c

index 86c0cdc..b2a2170 100644
@@ -305,10 +305,10 @@ int libcheck_check(struct checker * c)
        } else {
                if (uatomic_read(&ct->holders) > 1) {
                        /* The thread has been cancelled but hasn't
-                        * quilt. Fail back to synchronous mode */
-                       condlog(3, "%d:%d : tur checker failing back to sync",
+                        * quit. exit with timeout. */
+                       condlog(3, "%d:%d : tur thread not responding",
                                major(ct->devt), minor(ct->devt));
-                       return tur_check(c->fd, c->timeout, c->message);
+                       return PATH_TIMEOUT;
                }
                /* Start new TUR checker */
                pthread_mutex_lock(&ct->lock);
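
The decision made in the hunk above can be sketched in isolation. This is a minimal illustration, not the code in tur.c: it uses C11 <stdatomic.h> in place of the uatomic_read() call from the diff, and the names tur_ctx_sketch, start_new_check_sketch, SKETCH_PATH_PENDING and SKETCH_PATH_TIMEOUT are hypothetical stand-ins. It models only the branch where no checker thread is currently being tracked, and assumes that a holders count above 1 means a previously cancelled thread still holds a reference to the context.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical result codes; the real checker states live in libmultipath. */
enum sketch_state { SKETCH_PATH_PENDING, SKETCH_PATH_TIMEOUT };

/* Hypothetical, reduced stand-in for the tur checker context. */
struct tur_ctx_sketch {
	/* 1 = only the caller holds the context,
	 * >1 = a checker thread still holds a reference (assumption). */
	atomic_int holders;
};

/*
 * Called when no checker thread is being tracked for this path.
 * If an earlier, cancelled thread has not yet dropped its reference,
 * report a timeout right away instead of issuing a blocking check.
 */
static enum sketch_state start_new_check_sketch(struct tur_ctx_sketch *ct)
{
	assert(ct != NULL);

	if (atomic_load(&ct->holders) > 1) {
		/* Cancelled thread has not quit: do not block the caller. */
		return SKETCH_PATH_TIMEOUT;
	}

	/* A new checker thread would be started here; it takes a reference. */
	atomic_fetch_add(&ct->holders, 1);
	return SKETCH_PATH_PENDING;
}
```

With holders initialised to 1, a call returns SKETCH_PATH_PENDING and raises the count to 2. While the count stays above 1, further calls return SKETCH_PATH_TIMEOUT without touching the device, which is the property the commit message relies on to keep one stuck path from blocking the caller.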