AFR with Existing Data

From GlusterDocumentation

Overview - Let Gluster Copy Existing Data To the other AFR Brick

AFR is great! But what if you already have tons of existing data and want to start a brick directly on top of it, rather than creating the AFR volume and then copying the data into it? The following two emails from the mailing list explain how.

On Thu, Apr 24, 2008 at 5:57 PM,  <SOMEONE@SOMEWHERE> wrote:
> Hi,
>
>  I'm trying to move a large volume of data from local disk to GlusterFS. I
> could just copy it, but copying ~ 1TB of data is slow. So, what I've tried
> to do (with some randomly generated data for a test case) is to specify the
> directory already containing the data as the data source for the underlying
> storage brick.
>
>  I then fire up glusterfsd and glusterfs on the same machine, and I can see
> all the data via the mountpoint.
>
>  On another node, I start glusterfsd and glusterfs, and I can see and read
> the data. But, the data doesn't appear on the underlying data brick on the
> 2nd node after I have done cat * > /dev/null in the mounted directory.
>
>  So it looks like GlusterFS isn't causing the data to get copied on reads in
> this scenario.
>
>  Can anyone hazard a guess as to why this might be? I am guessing that it's
> to do with the fact that the xattrs/metadata have not been initialized by
> glusterfs because the files were added "underneath" rather than via the
> mountpoint. Is there a workaround for this, e.g. by manually setting some
> xattrs on the files (in a hope that this might be faster than copying the
> whole volume)?

Your guess is right. Just set the xattr "trusted.glusterfs.version" to 3 on
every file and directory in the tree (including the exported directory), then
try find + cat. It should work.

Krishna

On Tue, Mar 25, 2008 at 7:58 PM, Krishna Srinivas <krishna@SOMEWHERE> wrote:
> Hi Joey,
>
>  Yes that is the way it has to be done after patch-712 (btw 718 fixes a bug that
>  was introduced in 718 in afr)



sorry, 718 fixes a bug introduced in 712 :)


>
>  you can set trusted.glusterfs.createtime to string "0" and then set
>  trusted.glusterfs.version to string "3"
>
>  setfattr -n trusted.glusterfs.version -v 3 <file/dir>
>
>  Regards
>  Krishna
>
>
>
>  On Tue, Mar 25, 2008 at 7:21 PM, Joey Novak <joey.novak@SOMEWHERE> wrote:
>  > Hi Amar,
>  >
>  >   We have about 140GB of mail data, and it takes a LONG LONG time to copy it
>  >  over, we want to start an AFR brick with this data, we were thinking about
>  >  doing it by writing a script that manually sets the version and create date
>  >  attributes on the existing data, and then, just adding the second server
>  >  in.  Has anyone else done this?
>  >
>  >     Joey

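
The advice in the two emails above can be sketched as a small shell script. This is only a sketch under assumptions, not a tested procedure: the function names (set_xattr, stamp_tree, read_tree) and the DRY_RUN switch are made up for illustration. Only the two xattr names, their values, and the setfattr invocation come from the emails. Setting trusted.* xattrs normally requires root.

```shell
#!/bin/sh
# Hypothetical helpers for stamping existing data before adding a second AFR server.

# Apply one xattr, or only print the command when DRY_RUN=1 (illustrative switch).
set_xattr() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'setfattr -n %s -v %s %s\n' "$1" "$2" "$3"
    else
        setfattr -n "$1" -v "$2" "$3"
    fi
}

# Stamp every regular file and directory under the export directory,
# including the export directory itself. Symlinks and special files are
# skipped, and file names containing newlines are not handled.
stamp_tree() {
    find "$1" \( -type f -o -type d \) -print | while IFS= read -r path; do
        set_xattr trusted.glusterfs.createtime 0 "$path"
        set_xattr trusted.glusterfs.version 3 "$path"
    done
}

# The "find + cat" step: read every file through the GlusterFS mountpoint
# and discard the output.
read_tree() {
    find "$1" -type f -exec cat {} + > /dev/null
}
```

For example, with a hypothetical export directory /data/export and a hypothetical mountpoint /mnt/glusterfs, you could run DRY_RUN=1 stamp_tree /data/export to review the commands first, then stamp_tree /data/export as root on the server that holds the data, and finally read_tree /mnt/glusterfs on a client once both servers are up.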

Overview - Creating the two bricks yourself.

    • Untested: it should be the same as above, but you have to copy all the data to the second brick yourself and set the xattrs on the data to be identical on both volumes.
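
One way to check the "identical xattrs" requirement is to dump the GlusterFS xattrs of both bricks and diff the dumps. The sketch below is as untested as the procedure itself. The function names (dump_xattrs, compare_bricks) and the GETFATTR variable are illustrative, and it assumes the getfattr tool is installed and that you run it as root, since trusted.* xattrs are not visible to ordinary users.

```shell
#!/bin/sh
# Hypothetical check that two brick directories carry the same GlusterFS xattrs.

# Command used to read xattrs; overridable for testing (illustrative variable).
GETFATTR="${GETFATTR:-getfattr}"

# Print the trusted.glusterfs.* xattrs of everything under a brick directory.
# Paths are relative to the brick and sorted, so two dumps line up.
# File names containing newlines are not handled.
dump_xattrs() {
    ( cd "$1" && find . -print | LC_ALL=C sort | while IFS= read -r path; do
          "$GETFATTR" -h -d -m '^trusted\.glusterfs\.' -e hex "$path"
      done )
}

# Diff the dumps of two local brick directories.
# Returns 0 when they match, non-zero otherwise.
compare_bricks() {
    dump_a=$(mktemp) && dump_b=$(mktemp) || return 2
    dump_xattrs "$1" > "$dump_a"
    dump_xattrs "$2" > "$dump_b"
    diff "$dump_a" "$dump_b"
    rc=$?
    rm -f "$dump_a" "$dump_b"
    return "$rc"
}
```

When the bricks live on two servers, you could run dump_xattrs on each server, save the output to a file, and diff the two files by hand instead of calling compare_bricks.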