<html><head><meta http-equiv="content-type" content="text/html; charset=UTF-8"><style>body { line-height: 1.5; }blockquote { margin-top: 0px; margin-bottom: 0px; margin-left: 0.5em; }body { font-size: 10.5pt; font-family: 'Microsoft YaHei UI'; color: rgb(0, 0, 0); line-height: 1.5; }</style></head><body>
<div>Hi Susant,</div><div><br></div><div>You are right, the rebalance process itself is normal now. However, the memory usage of the brick being written to keeps increasing during the rebalance. The current task has been running for 16 hours; here is the output of top.</div><div><br></div><div>===================== top ===========================</div>
<pre>top - 08:58:27 up 3 days, 12:08,  1 user,  load average: 1.33, 1.18, 1.21
Tasks: 173 total,   1 running, 172 sleeping,   0 stopped,   0 zombie
Cpu(s): 13.0%us, 16.9%sy,  0.0%ni, 65.7%id,  2.7%wa,  0.0%hi,  1.8%si,  0.0%st
Mem:   8060900k total,  7923204k used,   137696k free,  4528380k buffers
Swap:        0k total,        0k used,        0k free,   393444k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8555 root      20   0  950m 143m 1728 S 154.7  1.8 875:01.07 glusterfs
 8479 root      20   0 1284m 139m 1892 S 69.8  1.8 443:25.88 glusterfsd
 8497 root      20   0 2628m 1.8g 1892 S 68.2 23.0 485:31.42 glusterfsd
  874 root      20   0     0    0    0 S  2.3  0.0  65:34.68 jbd2/vdb1-8
   58 root      20   0     0    0    0 S  0.7  0.0  44:44.37 kblockd/0
   99 root      20   0     0    0    0 S  0.7  0.0  39:17.63 kswapd0
   39 root      20   0     0    0    0 S  0.3  0.0   0:16.90 events/4
</pre>
<div>=====================================================</div><div>As you can see, PID 8497 (a glusterfsd brick process) now uses 1.8g of resident memory, which is 23.0% of RAM.</div><div><br></div><div>I have taken some state dumps. The later dumps are much larger than the earlier one.</div><div>================ ls -lh /var/run/gluster/*dump* ================</div>
<pre>-rw------- 1 root root 4.1M Dec 17 17:52 mnt-b1-brick.8497.dump.1450345948
-rw------- 1 root root 292M Dec 18 09:08 mnt-b1-brick.8497.dump.1450400909
-rw------- 1 root root 297M Dec 18 09:15 mnt-b1-brick.8497.dump.1450401273
</pre>
<div>=====================================================</div><div><br></div><div>You can download these state dumps (gzipped) from this URL:</div><div>http://pan.baidu.com/s/1jHuZCMU</div><div><br></div>
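<div>For anyone comparing the dumps, here is a minimal Python sketch that ranks memory-accounting sections by how much they grew between two dumps. It assumes the dump is plain text in which each accounting section starts with a bracketed header ending in "memusage]" and contains a numeric "size=" line; I have not checked that against these particular files. The function names parse_memusage and top_growth are made up for illustration.</div>
<pre>
```python
def parse_memusage(dump_text):
    """Map each '... memusage]' section header to its size= value in bytes."""
    sizes = {}
    section = None
    for raw_line in dump_text.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            # Track only sections whose header ends in 'memusage]' (assumed layout).
            section = line if line.endswith("memusage]") else None
        elif section is not None and line.startswith("size="):
            sizes[section] = int(line.split("=", 1)[1])
    return sizes


def top_growth(earlier_text, later_text, limit=10):
    """Return (section, growth_in_bytes) pairs, largest growth first."""
    earlier = parse_memusage(earlier_text)
    later = parse_memusage(later_text)
    # A section missing from the earlier dump counts as having started at 0 bytes.
    growth = [(name, size - earlier.get(name, 0)) for name, size in later.items()]
    growth.sort(key=lambda item: item[1], reverse=True)
    return growth[:limit]
```
</pre>
<div>To use it, read the text of an earlier dump (for example mnt-b1-brick.8497.dump.1450345948) and a later one (for example mnt-b1-brick.8497.dump.1450401273) and pass both strings to top_growth. The sections at the top of the result are the first candidates to look at.</div>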
<div><br></div><hr style="width: 210px; height: 1px;" color="#b5c4df" size="1" align="left">
<div><span><div style="MARGIN: 10px; FONT-FAMILY: verdana; FONT-SIZE: 10pt"><div>PuYun</div></div></span></div>
<blockquote style="margin-top: 0px; margin-bottom: 0px; margin-left: 0.5em;"><div>&nbsp;</div><div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm"><div style="PADDING-RIGHT: 8px; PADDING-LEFT: 8px; FONT-SIZE: 12px;FONT-FAMILY:tahoma;COLOR:#000000; BACKGROUND: #efefef; PADDING-BOTTOM: 8px; PADDING-TOP: 8px"><div><b>From:</b>&nbsp;<a href="mailto:spalai@redhat.com">Susant Palai</a></div><div><b>Date:</b>&nbsp;2015-12-17&nbsp;20:23</div><div><b>To:</b>&nbsp;<a href="mailto:cloudor@126.com">PuYun</a></div><div><b>CC:</b>&nbsp;<a href="mailto:gluster-users@gluster.org">gluster-users</a></div><div><b>Subject:</b>&nbsp;Re: [Gluster-users] How to diagnose volume rebalance failure?</div></div></div><div><div>Ok from your reply rebalance seems to be fine. </div>
<div>So please check whether the memory usage of the brick process keeps increasing. If it does, take several state dumps at intervals so that they can be compared.</div>
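<div>A minimal Python sketch of such a check, assuming a Linux host where /proc/PID/status reports the resident set size on a "VmRSS:" line in kB. The function names and the min_growth_kib threshold are hypothetical and only for illustration.</div>
<pre>
```python
import re


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/PID/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vm_rss_kib(handle.read())


def keeps_growing(samples_kib, min_growth_kib=0):
    """True if each sample exceeds the previous one by more than min_growth_kib."""
    pairs = zip(samples_kib, samples_kib[1:])
    return len(samples_kib) > 1 and all(b - a > min_growth_kib for a, b in pairs)
```
</pre>
<div>Call read_rss_kib with the brick PID every few minutes and collect the results in a list. If keeps_growing returns True for the list, take a state dump, for example with "gluster volume statedump VOLNAME", where VOLNAME is a placeholder for the volume name.</div>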
<div>&nbsp;</div>
<div>Regards,</div>
<div>Susant </div>
<div>&nbsp;</div>
<div>----- Original Message -----</div>
<div>From: "PuYun" &lt;cloudor@126.com&gt;</div>
<div>To: "gluster-users" &lt;gluster-users@gluster.org&gt;</div>
<div>Cc: "gluster-users" &lt;gluster-users@gluster.org&gt;</div>
<div>Sent: Thursday, 17 December, 2015 3:57:12 PM</div>
<div>Subject: Re: [Gluster-users] How to diagnose volume rebalance failure?</div>
<div>&nbsp;</div>
<div>&nbsp;</div>
<div>&nbsp;</div>
<div>Hi Susant, </div>
<div>&nbsp;</div>
<div>&nbsp;</div>
<div>Thank you for your instructions. I'll do that. </div>
<div>&nbsp;</div>
<div>&nbsp;</div>
<div>My volume contains more than 2 million leaf subdirectories, most of which contain 10~30 small files. The current total size is about 900G. There are two bricks of 1T each, and the machine has 8G of RAM. </div>
<div>&nbsp;</div>
<div>&nbsp;</div>
<div>Previously I saw 3 processes: one glusterfs process for the rebalance and 2 glusterfsd processes for the bricks. Only 1 glusterfsd process used a very large amount of memory, and it belongs to the newly added brick. The other 2 processes seem normal. If that happens again, I will send you the state dump. </div>
<div>&nbsp;</div>
<div>&nbsp;</div>
<div>Thank you. </div>
<div>&nbsp;</div>
<div>PuYun </div>
<div>&nbsp;</div>
</div></blockquote>
</body></html>