<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
The RDMA warnings are not relevant if you don't use RDMA. They simply
indicate that Gluster tried to register the RDMA transport and couldn't,
which would be expected if your system doesn't support it.<br>
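<br>
If you ever want to confirm whether a node actually has usable RDMA
hardware, a quick sketch (ibv_devices and ibv_devinfo ship with
libibverbs-utils on CentOS; treat the exact package name as an
assumption for other distros):<br>
<br>
ibv_devices&nbsp;&nbsp;&nbsp; (lists the RDMA-capable devices; no output means none)<br>
ibv_devinfo&nbsp;&nbsp;&nbsp; (shows the port state, e.g. PORT_ACTIVE, for each device)<br>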
<br>
<div class="moz-cite-prefix">On 03/23/2015 12:29 AM, Mohammed Rafi K
C wrote:<br>
</div>
<blockquote cite="mid:550FC0C7.4040100@redhat.com" type="cite">
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
<br>
<div class="moz-cite-prefix">On 03/23/2015 11:28 AM, Jonathan
Heese wrote:<br>
</div>
<blockquote
cite="mid:3BBF89B7-2F55-48C8-A93B-CA6BE22AFD12@inetu.net"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<div>On Mar 23, 2015, at 1:20 AM, "Mohammed Rafi K C" <<a
moz-do-not-send="true" href="mailto:rkavunga@redhat.com">rkavunga@redhat.com</a>>
wrote:<br>
<br>
</div>
<blockquote type="cite">
<div><br>
<div class="moz-cite-prefix">On 03/21/2015 07:49 PM,
Jonathan Heese wrote:<br>
</div>
<blockquote
cite="mid:9db8a1f4e38b4ba8abc485483ef76696@int-exch6.int.inetu.net"
type="cite">
<div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
<p>Mohammed,</p>
<p><br>
</p>
<p>I have completed the steps you suggested (unmount
all, stop the volume, set the config.transport to tcp,
start the volume, mount, etc.), and the behavior has
indeed changed.</p>
<p><br>
</p>
<p>[root@duke ~]# gluster volume info<br>
<br>
Volume Name: gluster_disk<br>
Type: Replicate<br>
Volume ID: 2307a5a8-641e-44f4-8eaf-7cc2b704aafd<br>
Status: Started<br>
Number of Bricks: 1 x 2 = 2<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1: duke-ib:/bricks/brick1<br>
Brick2: duchess-ib:/bricks/brick1<br>
Options Reconfigured:<br>
config.transport: tcp</p>
<p><br>
[root@duke ~]# gluster volume status<br>
Status of volume: gluster_disk<br>
Gluster process&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Port&nbsp;&nbsp; Online&nbsp; Pid<br>
------------------------------------------------------------------------------<br>
Brick duke-ib:/bricks/brick1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 49152&nbsp; Y&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 16362<br>
Brick duchess-ib:/bricks/brick1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 49152&nbsp; Y&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 14155<br>
NFS Server on localhost&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2049&nbsp;&nbsp; Y&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 16374<br>
Self-heal Daemon on localhost&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; N/A&nbsp;&nbsp;&nbsp; Y&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 16381<br>
NFS Server on duchess-ib&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2049&nbsp;&nbsp; Y&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 14167<br>
Self-heal Daemon on duchess-ib&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; N/A&nbsp;&nbsp;&nbsp; Y&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 14174<br>
<br>
Task Status of Volume gluster_disk<br>
------------------------------------------------------------------------------<br>
There are no active volume tasks<br>
<br>
</p>
<p>I am no longer seeing the I/O errors during prolonged
periods of write I/O that I was seeing when the
transport was set to rdma. However, I am seeing this
message on both nodes every 3 seconds (almost
exactly):</p>
<p><br>
</p>
<p>==> /var/log/glusterfs/nfs.log <==<br>
[2015-03-21 14:17:40.379719] W
[rdma.c:1076:gf_rdma_cm_event_handler]
0-gluster_disk-client-1: cma event
RDMA_CM_EVENT_REJECTED, error 8 (me:10.10.10.1:1023
peer:10.10.10.2:49152)<br>
</p>
<p><br>
</p>
<p>Is this something to worry about? </p>
</div>
</blockquote>
If you are not using NFS to export the volumes, there is nothing to
worry about.<br>
</div>
</blockquote>
<div><br>
</div>
I'm using the native glusterfs FUSE component to mount the
volume locally on both servers -- I assume that you're referring
to the standard NFS protocol stuff, which I'm not using here.
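<div><br>
</div>
<div>For reference, the mounts look roughly like this on each node (the
mount point is only inferred from the mount log's file name, so treat
it as illustrative):<br>
<br>
mount -t glusterfs duke-ib:/gluster_disk /mnt/gluster_disk&nbsp;&nbsp;&nbsp; (on duke)<br>
mount -t glusterfs duchess-ib:/gluster_disk /mnt/gluster_disk&nbsp;&nbsp;&nbsp; (on duchess)<br>
</div>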
<div><br>
</div>
<div>Incidentally, I would like to keep my logs from filling up
with junk if possible. Is there something I can do to get rid
of these (useless?) error messages?<br>
</div>
</blockquote>
<br>
If I understand correctly, you are now seeing this flood of log
messages only in the NFS log, and all the other logs are fine, right?
If that is the case, and you are not using NFS to export the volume at
all, then as a workaround you can disable NFS for your volume (gluster
volume set &lt;volname&gt; nfs.disable on). This will turn off the
Gluster NFS server, and you will no longer get those log messages.<br>
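<br>
A minimal sketch of that workaround, assuming the volume is still named
gluster_disk (adjust the name if yours differs):<br>
<br>
gluster volume set gluster_disk nfs.disable on<br>
gluster volume info gluster_disk&nbsp;&nbsp;&nbsp; (nfs.disable: on should now appear under Options Reconfigured)<br>
<br>
If you want to double-check first that nothing is relying on the
gluster NFS export, running showmount -e against each node will list
what the gluster NFS server is currently exporting.<br>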
<br>
<br>
<blockquote
cite="mid:3BBF89B7-2F55-48C8-A93B-CA6BE22AFD12@inetu.net"
type="cite">
<div>
<div>
<blockquote type="cite">
<div>
<blockquote
cite="mid:9db8a1f4e38b4ba8abc485483ef76696@int-exch6.int.inetu.net"
type="cite">
<div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
<p>Any idea why there are rdma pieces in play when
I've set my transport to tcp?</p>
</div>
</blockquote>
<br>
There should not be any rdma pieces left in play. If possible, can
you paste the volfile for the NFS server? You can find the
volfile in /var/lib/glusterd/nfs/nfs-server.vol or
/usr/local/var/lib/glusterd/nfs/nfs-server.vol<br>
</div>
</blockquote>
<div><br>
</div>
<div>I will get this for you when I can. Thanks.</div>
</div>
</div>
</blockquote>
<br>
If you can get that, it will be a great help in understanding the
problem.<br>
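<br>
Even a quick grep would help; something along these lines (the path
depends on whether you installed from packages or from source):<br>
<br>
grep -n -e rdma -e transport-type /var/lib/glusterd/nfs/nfs-server.vol<br>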
<br>
<br>
Rafi KC<br>
<br>
<blockquote
cite="mid:3BBF89B7-2F55-48C8-A93B-CA6BE22AFD12@inetu.net"
type="cite">
<div>
<div>
<div><br>
</div>
<div>Regards,</div>
<div>Jon Heese</div>
<br>
<blockquote type="cite">
<div>Rafi KC<br>
<blockquote
cite="mid:9db8a1f4e38b4ba8abc485483ef76696@int-exch6.int.inetu.net"
type="cite">
<div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
<p>The actual I/O appears to be handled properly and
I've seen no further errors in the testing I've
done so far.</p>
<p><br>
</p>
<p>Thanks.<br>
</p>
<p><br>
</p>
<p>Regards,</p>
<p>Jon Heese</p>
<p><br>
</p>
<div style="color: rgb(40, 40, 40);" dir="auto">
<hr tabindex="-1" style="display:inline-block;
width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font
style="font-size:11pt" color="#000000"
face="Calibri, sans-serif"><b>From:</b> <a
moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:gluster-users-bounces@gluster.org">
gluster-users-bounces@gluster.org</a> <a
moz-do-not-send="true"
class="moz-txt-link-rfc2396E"
href="mailto:gluster-users-bounces@gluster.org">
<gluster-users-bounces@gluster.org></a>
on behalf of Jonathan Heese <a
moz-do-not-send="true"
class="moz-txt-link-rfc2396E"
href="mailto:jheese@inetu.net">
<jheese@inetu.net></a><br>
<b>Sent:</b> Friday, March 20, 2015 7:04 AM<br>
<b>To:</b> Mohammed Rafi K C<br>
<b>Cc:</b> gluster-users<br>
<b>Subject:</b> Re: [Gluster-users] I/O error
on replicated volume</font>
<div> </div>
</div>
<div>
<div>Mohammed,</div>
<div><br>
</div>
<div>Thanks very much for the reply. I will try
that and report back.<br>
<br>
Regards,
<div>Jon Heese</div>
</div>
<div><br>
On Mar 20, 2015, at 3:26 AM, "Mohammed Rafi K
C" <<a moz-do-not-send="true"
href="mailto:rkavunga@redhat.com">rkavunga@redhat.com</a>>
wrote:<br>
<br>
</div>
<blockquote type="cite">
<div><br>
<div class="moz-cite-prefix">On 03/19/2015
10:16 PM, Jonathan Heese wrote:<br>
</div>
<blockquote type="cite">
<div class="WordSection1">
<p class="MsoNormal"><a
moz-do-not-send="true"
name="_MailEndCompose"><span
style="color:#1F497D">Hello all,</span></a></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">Does anyone
else have any further suggestions
for troubleshooting this?</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">To sum up: I
have a 2 node 2 brick replicated
volume, which holds a handful of
iSCSI image files which are mounted
and served up by tgtd (CentOS 6) to
a handful of devices on a dedicated
iSCSI network. The most important
iSCSI clients (initiators) are four
VMware ESXi 5.5 hosts that use the
iSCSI volumes as backing for their
datastores for virtual machine
storage.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">After a few
minutes of sustained writing to the
volume, I am seeing a massive flood
(over 1500 per second at times) of
this error in
/var/log/glusterfs/mnt-gluster-disk.log:</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
02:24:07.582801] W
[fuse-bridge.c:2242:fuse_writev_cbk]
0-glusterfs-fuse: 635358: WRITE
=> -1 (Input/output error)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">When this
happens, the ESXi box fails its
write operation and returns an error
to the effect of “Unable to write
data to datastore”. I don’t see
anything else in the supporting logs
to explain the root cause of the i/o
errors.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">Any and all
suggestions are appreciated.
Thanks.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
</div>
</blockquote>
<br>
From the mount logs, I assume that your
volume transport type is rdma. There are
some known issues with rdma in 3.5.3, and
the patches to address them have already
been sent upstream [1]. From the logs alone
it is hard to tell whether this problem is
related to the rdma transport or not. To
make sure that the tcp transport works well
in this scenario, can you try to reproduce
the issue using a tcp-type volume, if
possible? You can change the transport type
of a volume with the following steps (not
recommended in normal use); a concrete
sketch follows the list.<br>
<br>
1) unmount every client<br>
2) stop the volume<br>
3) run gluster volume set volname
config.transport tcp<br>
4) start the volume again<br>
5) mount the clients<br>
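<br>
As a concrete sketch of those steps, assuming the volume is named
gluster_disk and is mounted at /mnt/gluster_disk on each node
(adjust both names to match your setup):<br>
<br>
umount /mnt/gluster_disk&nbsp;&nbsp;&nbsp; (on every client)<br>
gluster volume stop gluster_disk<br>
gluster volume set gluster_disk config.transport tcp<br>
gluster volume start gluster_disk<br>
mount -t glusterfs duke-ib:/gluster_disk /mnt/gluster_disk<br>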
<br>
[1] : <a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://goo.gl/2PTL61">
http://goo.gl/2PTL61</a><br>
<br>
Regards<br>
Rafi KC<br>
<br>
<blockquote type="cite">
<div class="WordSection1">
<div>
<p class="MsoNormal" style=""><i><span
style="font-size:16.0pt;
font-family:"Georgia",serif;
color:#0F5789">Jon Heese</span></i><span
style=""><br>
</span><i><span
style="color:#333333">Systems
Engineer</span></i><span
style=""><br>
</span><b><span
style="color:#333333">INetU
Managed Hosting</span></b><span
style=""><br>
</span><span style="color:#333333">P:
610.266.7441 x 261</span><span
style=""><br>
</span><span style="color:#333333">F:
610.266.7434</span><span style=""><br>
</span><a moz-do-not-send="true"
href="https://www.inetu.net/"><span
style="color:blue">www.inetu.net</span></a><span
style=""></span></p>
<p class="MsoNormal"><i><span
style="font-size:8.0pt;
color:#333333">** This message
contains confidential
information, which also may be
privileged, and is intended only
for the person(s) addressed
above. Any unauthorized use,
distribution, copying or
disclosure of confidential
and/or privileged information is
strictly prohibited. If you have
received this communication in
error, please erase all copies
of the message and its
attachments and notify the
sender immediately via reply
e-mail. **</span></i><span
style="color:#1F497D"></span></p>
</div>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<div>
<div style="border:none;
border-top:solid #E1E1E1 1.0pt;
padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="color:windowtext">From:</span></b><span
style="color:windowtext">
Jonathan Heese <br>
<b>Sent:</b> Tuesday, March 17,
2015 12:36 PM<br>
<b>To:</b> 'Ravishankar N'; <a
moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="mailto:gluster-users@gluster.org"> gluster-users@gluster.org</a><br>
<b>Subject:</b> RE:
[Gluster-users] I/O error on
replicated volume</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<p class="MsoNormal"><span
style="color:#1F497D">Ravi,</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">The last lines
in the mount log before the massive
vomit of I/O errors are from 22
minutes prior, and seem innocuous to
me:</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:07.126340] E
[client-handshake.c:1760:client_query_portmap_cbk]
0-gluster_disk-client-0: failed to
get the port number for remote
subvolume. Please run 'gluster
volume status' on server to see if
brick process is running.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:07.126587] W
[rdma.c:4273:gf_rdma_disconnect]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)
[0x7fd9c557bccf]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7fd9c557a995]
(-->/usr/lib64/glusterfs/3.5.3/xlator/protocol/client.so(client_query_portmap_cbk+0x1ea)
[0x7fd9c0d8fb9a])))
0-gluster_disk-client-0: disconnect
called (peer:10.10.10.1:24008)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:07.126687] E
[client-handshake.c:1760:client_query_portmap_cbk]
0-gluster_disk-client-1: failed to
get the port number for remote
subvolume. Please run 'gluster
volume status' on server to see if
brick process is running.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:07.126737] W
[rdma.c:4273:gf_rdma_disconnect]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)
[0x7fd9c557bccf]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7fd9c557a995]
(-->/usr/lib64/glusterfs/3.5.3/xlator/protocol/client.so(client_query_portmap_cbk+0x1ea)
[0x7fd9c0d8fb9a])))
0-gluster_disk-client-1: disconnect
called (peer:10.10.10.2:24008)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.730165] I
[rpc-clnt.c:1729:rpc_clnt_reconfig]
0-gluster_disk-client-0: changing
port to 49152 (from 0)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.730276] W
[rdma.c:4273:gf_rdma_disconnect]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)
[0x7fd9c557bccf]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7fd9c557a995]
(-->/usr/lib64/glusterfs/3.5.3/xlator/protocol/client.so(client_query_portmap_cbk+0x1ea)
[0x7fd9c0d8fb9a])))
0-gluster_disk-client-0: disconnect
called (peer:10.10.10.1:24008)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.739500] I
[rpc-clnt.c:1729:rpc_clnt_reconfig]
0-gluster_disk-client-1: changing
port to 49152 (from 0)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.739560] W
[rdma.c:4273:gf_rdma_disconnect]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)
[0x7fd9c557bccf]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7fd9c557a995]
(-->/usr/lib64/glusterfs/3.5.3/xlator/protocol/client.so(client_query_portmap_cbk+0x1ea)
[0x7fd9c0d8fb9a])))
0-gluster_disk-client-1: disconnect
called (peer:10.10.10.2:24008)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.741883] I
[client-handshake.c:1677:select_server_supported_programs]
0-gluster_disk-client-0: Using
Program GlusterFS 3.3, Num
(1298437), Version (330)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.744524] I
[client-handshake.c:1462:client_setvolume_cbk]
0-gluster_disk-client-0: Connected
to 10.10.10.1:49152, attached to
remote volume '/bricks/brick1'.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.744537] I
[client-handshake.c:1474:client_setvolume_cbk]
0-gluster_disk-client-0: Server and
Client lk-version numbers are not
same, reopening the fds</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.744566] I
[afr-common.c:4267:afr_notify]
0-gluster_disk-replicate-0:
Subvolume 'gluster_disk-client-0'
came back up; going online.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.744627] I
[client-handshake.c:450:client_set_lk_version_cbk]
0-gluster_disk-client-0: Server lk
version = 1</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.753037] I
[client-handshake.c:1677:select_server_supported_programs]
0-gluster_disk-client-1: Using
Program GlusterFS 3.3, Num
(1298437), Version (330)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.755657] I
[client-handshake.c:1462:client_setvolume_cbk]
0-gluster_disk-client-1: Connected
to 10.10.10.2:49152, attached to
remote volume '/bricks/brick1'.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.755676] I
[client-handshake.c:1474:client_setvolume_cbk]
0-gluster_disk-client-1: Server and
Client lk-version numbers are not
same, reopening the fds</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.761945] I
[fuse-bridge.c:5016:fuse_graph_setup]
0-fuse: switched to graph 0</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[2015-03-16
01:37:10.762144] I
[client-handshake.c:450:client_set_lk_version_cbk]
0-gluster_disk-client-1: Server lk
version = 1</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[<b>2015-03-16
01:37:10.762279</b>] I
[fuse-bridge.c:3953:fuse_init]
0-glusterfs-fuse: FUSE inited with
protocol versions: glusterfs 7.22
kernel 7.14</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[<b>2015-03-16
01:59:26.098670</b>] W
[fuse-bridge.c:2242:fuse_writev_cbk]
0-glusterfs-fuse: 292084: WRITE
=> -1 (Input/output error)</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">…</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">I’ve seen no
indication of split-brain on any
files at any point in this (ever
since downgrading from 3.6.2 to
3.5.3, which is when this particular
issue started):</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">[root@duke
gfapi-module-for-linux-target-driver-]#
gluster v heal gluster_disk info</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">Brick
duke.jonheese.local:/bricks/brick1/</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">Number of
entries: 0</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">Brick
duchess.jonheese.local:/bricks/brick1/</span></p>
<p class="MsoNormal"><span
style="color:#1F497D">Number of
entries: 0</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<p class="MsoNormal"><span
style="color:#1F497D">Thanks.</span></p>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<div>
<p class="MsoNormal" style=""><i><span
style="font-size:16.0pt;
font-family:"Georgia",serif;
color:#0F5789">Jon Heese</span></i><span
style=""><br>
</span><i><span
style="color:#333333">Systems
Engineer</span></i><span
style=""><br>
</span><b><span
style="color:#333333">INetU
Managed Hosting</span></b><span
style=""><br>
</span><span style="color:#333333">P:
610.266.7441 x 261</span><span
style=""><br>
</span><span style="color:#333333">F:
610.266.7434</span><span style=""><br>
</span><a moz-do-not-send="true"
href="https://www.inetu.net/"><span
style="color:blue">www.inetu.net</span></a><span
style=""></span></p>
<p class="MsoNormal"><i><span
style="font-size:8.0pt;
color:#333333">** This message
contains confidential
information, which also may be
privileged, and is intended only
for the person(s) addressed
above. Any unauthorized use,
distribution, copying or
disclosure of confidential
and/or privileged information is
strictly prohibited. If you have
received this communication in
error, please erase all copies
of the message and its
attachments and notify the
sender immediately via reply
e-mail. **</span></i><span
style="color:#1F497D"></span></p>
</div>
<p class="MsoNormal"><span
style="color:#1F497D"> </span></p>
<div>
<div style="border:none;
border-top:solid #E1E1E1 1.0pt;
padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="color:windowtext">From:</span></b><span
style="color:windowtext">
Ravishankar N [</span><a
moz-do-not-send="true"
href="mailto:ravishankar@redhat.com">mailto:ravishankar@redhat.com</a><span
style="color:windowtext">] <br>
<b>Sent:</b> Tuesday, March 17,
2015 12:35 AM<br>
<b>To:</b> Jonathan Heese; </span><a
moz-do-not-send="true"
href="mailto:gluster-users@gluster.org">gluster-users@gluster.org</a><span
style="color:windowtext"><br>
<b>Subject:</b> Re:
[Gluster-users] I/O error on
replicated volume</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<p class="MsoNormal"><span
style="font-size:12.0pt"> </span></p>
<div>
<p class="MsoNormal">On 03/17/2015
02:14 AM, Jonathan Heese wrote:</p>
</div>
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal"
style="background:white"><span
style="font-size:12.0pt">Hello,<br>
<br>
So I resolved my previous
issue with split-brains and
the lack of self-healing by
dropping my installed
glusterfs* packages from 3.6.2
to 3.5.3, but now I've picked
up a new issue, which actually
makes normal use of the volume
practically impossible.<br>
<br>
A little background for those
not already paying close
attention:<br>
I have a 2 node 2 brick
replicating volume whose
purpose in life is to hold
iSCSI target files, primarily
for use to provide datastores
to a VMware ESXi cluster. The
plan is to put a handful of
image files on the Gluster
volume, mount them locally on
both Gluster nodes, and run
tgtd on both, pointed to the
image files on the mounted
gluster volume. Then the ESXi
boxes will use multipath
(active/passive) iSCSI to
connect to the nodes, with
automatic failover in case of
planned or unplanned downtime
of the Gluster nodes.<br>
<br>
In my most recent round of
testing with 3.5.3, I'm seeing
a massive failure to write
data to the volume after about
5-10 minutes, so I've
simplified the scenario a bit
(to minimize the variables)
to: both Gluster nodes up,
only one node (duke) mounted
and running tgtd, and just
regular (single path) iSCSI
from a single ESXi server.<br>
<br>
About 5-10 minutes into
migrating a VM onto the test
datastore, /var/log/messages
on duke gets blasted with a
ton of messages exactly like
this:</span></p>
<p class="MsoNormal"
style="background:white">Mar 15
22:24:06 duke tgtd:
bs_rdwr_request(180) io error
0x1781e00 2a -1 512 22971904,
Input/output error</p>
<p class="MsoNormal"
style="background:white"> </p>
<p class="MsoNormal"
style="background:white">And
/var/log/glusterfs/mnt-gluster_disk.log
gets blasted with a ton of
messages exactly like this:</p>
<p class="MsoNormal"
style="background:white">[2015-03-16
02:24:07.572279] W
[fuse-bridge.c:2242:fuse_writev_cbk]
0-glusterfs-fuse: 635299: WRITE
=> -1 (Input/output error)</p>
<p class="MsoNormal"
style="background:white"> </p>
</div>
</div>
</blockquote>
<p class="MsoNormal"
style="margin-bottom:12.0pt"><span
style=""><br>
Are there any messages in the mount
log from AFR about split-brain just
before the above line appears?<br>
Does `gluster v heal <VOLNAME>
info` show any files? Performing I/O
on files that are in split-brain
fails with EIO.<br>
<br>
-Ravi<br>
<br>
</span></p>
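<p class="MsoNormal"><span style="">For example (substituting your
volume name; an illustrative invocation, not verbatim output):<br>
<br>
gluster volume heal gluster_disk info<br>
gluster volume heal gluster_disk info split-brain<br>
</span></p>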
<blockquote style="margin-top:5.0pt;
margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal"
style="background:white">And the
write operation from VMware's
side fails as soon as these
messages start.</p>
<p class="MsoNormal"
style="background:white"> </p>
<p class="MsoNormal"
style="background:white">I don't
see any other errors (in the log
files I know of) indicating the
root cause of these i/o errors.
I'm sure that this is not enough
information to tell what's going
on, but can anyone help me
figure out what to look at next
to figure this out?</p>
<p class="MsoNormal"
style="background:white"> </p>
<p class="MsoNormal"
style="background:white">I've
also considered using Dan
Lambright's libgfapi gluster
module for tgtd (or something
similar) to avoid going through
FUSE, but I'm not sure whether
that would be irrelevant to this
problem, since I'm not 100% sure
if it lies in FUSE or elsewhere.</p>
<p class="MsoNormal"
style="background:white"> </p>
<p class="MsoNormal"
style="background:white">Thanks!</p>
<p class="MsoNormal"
style="background:white"> </p>
<p class="MsoNormal"
style="background:white"><i><span
style="font-size:16.0pt;
font-family:"Georgia",serif;
color:#0F5789">Jon Heese</span></i><span
style=""><br>
</span><i><span
style="color:#333333">Systems
Engineer</span></i><span
style=""><br>
</span><b><span
style="color:#333333">INetU
Managed Hosting</span></b><span
style=""><br>
</span><span
style="color:#333333">P:
610.266.7441 x 261</span><span
style=""><br>
</span><span
style="color:#333333">F:
610.266.7434</span><span
style=""><br>
</span><a moz-do-not-send="true"
href="https://www.inetu.net/"><span
style="color:blue">www.inetu.net</span></a></p>
<p class="MsoNormal"
style="background:white"><i><span
style="font-size:8.0pt;
color:#333333">** This
message contains
confidential information,
which also may be
privileged, and is intended
only for the person(s)
addressed above. Any
unauthorized use,
distribution, copying or
disclosure of confidential
and/or privileged
information is strictly
prohibited. If you have
received this communication
in error, please erase all
copies of the message and
its attachments and notify
the sender immediately via
reply e-mail. **</span></i></p>
<p class="MsoNormal"
style="background:white"> </p>
</div>
</div>
<p class="MsoNormal"
style="margin-bottom:12.0pt"><span
style=""><br>
<br>
</span></p>
</blockquote>
<p class="MsoNormal"><span style=""> </span></p>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre>_______________________________________________
Gluster-users mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<br>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
</div>
</blockquote>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Gluster-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<br>
</body>
</html>