<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Try 'getfattr -m . -d -e hex' (a dot instead of the dash after -m) and, of course,
do that as root, since the trusted.* attributes Gluster uses are only readable by root.<br>
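<br>
If you need to check many files in one go, here is a minimal Python sketch of the same check. It is only an illustration under some assumptions: the function names are made up, it has to run as root on the server that holds the brick, and I am assuming the attribute to look for is trusted.glusterfs.dht.linkto, which is what DHT link files normally carry. It only reads and reports; it changes nothing on the brick.<br>
<pre wrap="">
```python
import os
import stat
import sys

# Assumption: the xattr that DHT normally sets on its link files.
LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def is_sticky_regular(mode):
    """True for a regular file with the sticky bit set (e.g. -rw-rw-r-T)."""
    text = stat.filemode(mode)
    return text.startswith("-") and text.endswith(("t", "T"))


def gluster_xattrs(path):
    """Return {name: hex value} for the trusted.* xattrs on path.

    Without root privileges the trusted.* namespace is not visible,
    so the result is simply empty.
    """
    found = {}
    for name in os.listxattr(path, follow_symlinks=False):
        if name.startswith("trusted."):
            value = os.getxattr(path, name, follow_symlinks=False)
            found[name] = value.hex()
    return found


def scan_brick(brick_root):
    """Yield (path, xattrs) for every sticky-bit regular file on a brick."""
    for dirpath, dirnames, filenames in os.walk(brick_root):
        if ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")  # skip gluster's internal gfid store
        for name in filenames:
            path = os.path.join(dirpath, name)
            if is_sticky_regular(os.lstat(path).st_mode):
                yield path, gluster_xattrs(path)


if __name__ == "__main__" and len(sys.argv) == 2:
    for path, xattrs in scan_brick(sys.argv[1]):
        status = "linkto present" if LINKTO_XATTR in xattrs else "NO linkto xattr"
        print(status, path, xattrs)
```
</pre>
You would run it against one brick at a time, e.g. 'python3 check_linkto.py /data/glusterfs/safari/brick00/brick' (the script name is hypothetical). Files reported with "NO linkto xattr" would be the ones worth a closer look with getfattr.<br>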
<br>
<div class="moz-cite-prefix">On 12/20/2014 06:02 PM,
<a class="moz-txt-link-abbreviated" href="mailto:tbenzvi@3vgeomatics.com">tbenzvi@3vgeomatics.com</a> wrote:<br>
</div>
<blockquote
cite="mid:20141220190235.b2b02683b6fce9ed61e10e2e9bfae354.9213a98af5.mailapi@email04.secureserver.net"
type="cite">
<div>Hi everyone,</div>
<div> </div>
<div>We have a distributed Gluster volume on five bricks over two
servers (first server running gluster 3.4.2, second server
running gluster 3.5.1, both running Fedora 20).</div>
<div>Starting last week, doing a file listing on the mounted
volume shows many files with the same name appearing twice (and
they are listed with the same inode). Doing a search for these
files, I have found 290,000 of them!!</div>
<div> </div>
<div>If I do a listing of these files on the bricks themselves, it
looks like most are link files (du will show the file on the
first server as 0 bytes, and the sticky bit set). The file is
fine on the second server. Unfortunately, running "getfattr -m -
-e hex -d" on the file shows NO gluster-related attributes and I
believe this is why both files appear in the listing. The files
cannot be read by any program because the read goes to the link
file. I assume the metadata became corrupted. This is a
production server, so we really need to know:</div>
<div> </div>
<div>1. How did this happen, and how can we prevent it going
forward? There was a server crash a week ago and I believe that
was the cause.</div>
<div>2. How can we heal the Gluster volume/bricks and link files?
If there is a straightforward way of restoring the link file
pointer, I can write a script to do it; doing this
manually is obviously impossible.</div>
<div> </div>
<div>Thanks very much for any and all help - much appreciated!</div>
<div> </div>
<div>Regards,</div>
<div>Tom</div>
<div> </div>
<div> </div>
<div>On Wed, Dec 17, 2014 at 4:07 AM,
<a class="moz-txt-link-rfc2396E" href="mailto:tbenzvi@3vgeomatics.com">&lt;tbenzvi@3vgeomatics.com&gt;</a> wrote:</div>
<div>
<div>> Hi everyone, we have noticed some extremely odd
behaviour with our<br>
> distributed Gluster volume where duplicate files (same
name, same or<br>
> different content) are being created and stored on
multiple bricks. The only<br>
> consistent clue is that one of the duplicate files has
the sticky bit set. I<br>
> am hoping someone will be able to shed some light on why
this is happening<br>
> and how we can restore the volume as there appear to be
hundreds of such<br>
> files. I will try to provide as much pertinent
information as I can.<br>
><br>
> We have a 130TB Gluster volume consisting of two 20TB
bricks on server1, and<br>
> three 40TB bricks on server2, which were added at a
later date (and<br>
> rebalancing was done). The volume is mounted on server1,
and accessed only<br>
> through this server but by many users. Both servers went
down due to power<br>
> loss several days ago after which this problem was first
noticed. We ran a<br>
> rebalance command on the volume, but this has not fixed the
problem.<br>
><br>
><br>
> Gluster volume info:<br>
> Volume Name: safari<br>
> Type: Distribute<br>
> Volume ID: d48d0e6b-4389-4c2c-8fd1-cd2854121eda<br>
> Status: Started<br>
> Number of Bricks: 5<br>
> Transport-type: tcp<br>
> Bricks:<br>
> Brick1: server1:/data/glusterfs/safari/brick00/brick<br>
> Brick2: server1:/data/glusterfs/safari/brick01/brick<br>
> Brick3: server2:/data/glusterfs/safari/brick02/brick<br>
> Brick4: server2:/data/glusterfs/safari/brick03/brick<br>
> Brick5: server2:/data/glusterfs/safari/brick04/brick<br>
><br>
><br>
> Size information:<br>
> /dev/sdc 37T 16T 22T 42% /data/glusterfs/safari/brick02<br>
> /dev/sdd 37T 16T 22T 42% /data/glusterfs/safari/brick03<br>
> /dev/sde 37T 17T 21T 45% /data/glusterfs/safari/brick04<br>
> /dev/md126 11T 7.7T 2.8T 74%
/data/glusterfs/safari/brick00<br>
> /dev/md124 11T 8.0T 2.5T 77%
/data/glusterfs/safari/brick01<br>
> server2:/safari 130T 63T 68T 48% /sar<br>
><br>
><br>
> Example 1:<br>
> -Two files with the same name exist in one directory<br>
> -They have different contents and attributes<br>
> -A file listing on the mounted volume shows the same
inode<br>
> -The newer file has sticky bit set<br>
> -Neither file is corrupted, they can both be viewed by
using the absolute<br>
> path (on the bricks)<br>
><br>
> File listing on the mounted volume<br>
> 13036730497538635177 -rw-rw-r-T 1 jon users 924 Dec 15
10:42 RSLC_tab<br>
> 13036730497538635177 -rw-rw-r-- 1 jon users 418 Mar 18
2013 RSLC_tab<br>
><br>
> Listing of the files on the bricks:<br>
> 8925798411 -rw-rw-r-T+ 2 jon users 924 Dec 15 10:42<br>
>
/data/glusterfs/safari/brick00/brick/complete/shm/rs2/ottawa/mf6_asc/stack_org/RSLC_tab<br>
> 51541886672 -rw-rw-r--+ 2 1002 users 418 Mar 18 2013<br>
>
/data/glusterfs/safari/brick02/brick/complete/shm/rs2/ottawa/mf6_asc/stack_org/RSLC_tab<br>
><br>
><br>
> Example 2:<br>
> -Two files with the same name exist in one directory<br>
> -They have the same content and attributes<br>
> -No sticky bit is set when looking at file listing on the
mounted volume<br>
> -Sticky bit is set for one file when looking at file
listing on the bricks<br>
> -Files are corrupted<br>
><br>
> File listing on the mounted volume:<br>
> 13012555852904096080 -rw-rw-r-- 1 tom users 2393848 Dec 8
2013<br>
> ifg_lr/20130226_20130813.diff.phi.ras<br>
> 13012555852904096080 -rw-rw-r-- 1 tom users 2393848 Dec 8
2013<br>
> ifg_lr/20130226_20130813.diff.phi.ras<br>
><br>
> Listing of the files on the bricks:<br>
> 17058578 -rw-rw-r-T+ 2 tom users 2393848 Dec 13 17:11<br>
>
/data/glusterfs/safari/brick00/brick/rsc/rs2/calgary/u22_dsc/stack_org/ifg_lr/20130226_20130813.diff.phi.ras<br>
> 57986922129 -rw-rw-r--+ 2 1010 users 2393848 Dec 8 2013<br>
>
/data/glusterfs/safari/brick02/brick/rsc/rs2/calgary/u22_dsc/stack_org/ifg_lr/20130226_20130813.diff.phi.ras<br>
><br>
><br>
> Additionally, only some files in this directory are
duplicated. The<br>
> duplicated files are corrupted (cannot be viewed as
raster images, the<br>
> original file type)<br>
> The files which are not duplicated are not corrupted.<br>
><br>
> File command: (notice duplicate and singleton files)<br>
> ifg_lr/20091021_20100218.diff.phi.ras: Sun raster image
data, 1208 x 1981,<br>
> 8-bit, RGB colormap<br>
> ifg_lr/20091021_20101016.diff.phi.ras: data<br>
> ifg_lr/20091021_20101016.diff.phi.ras: data<br>
> ifg_lr/20091021_20101109.diff.phi.ras: Sun raster image
data, 1208 x 1981,<br>
> 8-bit, RGB colormap<br>
> ifg_lr/20091021_20101203.diff.phi.ras: Sun raster image
data, 1208 x 1981,<br>
> 8-bit, RGB colormap<br>
> ifg_lr/20091021_20101227.diff.phi.ras: Sun raster image
data, 1208 x 1981,<br>
> 8-bit, RGB colormap<br>
> ifg_lr/20091021_20110120.diff.phi.ras: Sun raster image
data, 1208 x 1981,<br>
> 8-bit, RGB colormap<br>
> ifg_lr/20091021_20110213.diff.phi.ras: data<br>
> ifg_lr/20091021_20110213.diff.phi.ras: data<br>
> ifg_lr/20091021_20110309.diff.phi.ras: data<br>
> ifg_lr/20091021_20110309.diff.phi.ras: sticky data<br>
> ifg_lr/20091021_20110402.diff.phi.ras: Sun raster image
data, 1208 x 1981,<br>
> 8-bit, RGB colormap</div>
</div>
</blockquote>
<br>
</body>
</html>