Git submodules are actually a very beautiful thing. You might prefer the word powerful or elegant, but that’s not the point. The downside is that they are sometimes misused, so as always, use with care. I’ve used them in projects like puppet-gluster, oh-my-vagrant, and others. If you’re not familiar with them, do a bit of reading and come back later; I’ll wait.
I recently did some work packaging Oh-My-Vagrant as RPMs. My primary goal was to make sure the entire process was automatic, as I have no patience for manually building RPMs. Any good packager knows that the prerequisite for building an SRPM is a source tarball, and I wanted to build those automatically too.
Simply running tar -cf on my source directory wouldn’t work, because I only want to include files that are stored in git. Thankfully, git comes with a tool called git archive, which does exactly that! No scary tar commands required.
Here’s how you might run it:
$ git archive --prefix=some-project/ -o output.tar.bz2 HEAD
Let’s decompose:
The --prefix argument prepends a string prefix onto every file in the archive. Therefore, if you’d like the root directory to be named some-project, then you pass that string with a trailing slash, and you’ll have everything nested inside a directory!
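The -o flag picks the output file, and git infers the archive format from the file extension. Using .tar.bz2 is quite common, but note that out of the box git only recognizes tar, zip, tar.gz and tgz; any other extension quietly produces a plain uncompressed tar unless a matching tar.<format>.command has been configured.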
Lastly, the HEAD portion at the end specifies which git tree to pull the files from. I usually specify a git tag here, but you can specify a commit id if you prefer.
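To make the three pieces above concrete, here is a small sketch that builds a throwaway repository and archives it. The names workdir, demo, tracked.txt and untracked.txt are made up for the example, and I write to a plain .tar file so that there is no ambiguity about the format.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a throwaway repository so the example is self-contained.
# (All names here are hypothetical, chosen only for illustration.)
workdir=$(mktemp -d)
cd "$workdir"
git init -q demo
cd demo
echo "hello" > tracked.txt
echo "scratch" > untracked.txt   # never added to git, so never archived
git add tracked.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "initial commit"

# Archive the HEAD tree, nesting every path under some-project/.
git archive --prefix=some-project/ -o "$workdir/output.tar" HEAD

# List what ended up in the archive.
listing=$(tar -tf "$workdir/output.tar")
echo "$listing"
```

The listing should contain some-project/tracked.txt and nothing for untracked.txt, which is exactly the behaviour a plain tar of the directory can’t give you.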
This is all well and good, but unfortunately, when I open my newly created archive, it is notably missing my git submodules! It would probably make sense for there to be an upstream option so that a --recursive flag would do this magic for you, but unfortunately it doesn’t exist yet.
There are a few scripts floating around that can do this, but I wanted something small, and without any real dependencies, that I can embed in my project Makefile, so that it’s all self-contained.
Here’s what that looks like:
sometarget:
	@echo Running git archive...
	# use HEAD if tag doesn't exist yet, so that development is easier...
	git archive --prefix=oh-my-vagrant-$(VERSION)/ -o $(SOURCE) $(VERSION) 2> /dev/null || (echo 'Warning: $(VERSION) does not exist.' && git archive --prefix=oh-my-vagrant-$(VERSION)/ -o $(SOURCE) HEAD)
	# TODO: if git archive had a --submodules flag this would be easier!
	@echo Running git archive submodules...
	# i thought i would need --ignore-zeros, but it doesn't seem necessary!
	p=`pwd` && (echo .; git submodule foreach) | while read entering path; do \
		temp="$${path%\'}"; \
		temp="$${temp#\'}"; \
		path=$$temp; \
		[ "$$path" = "" ] && continue; \
		(cd $$path && git archive --prefix=oh-my-vagrant-$(VERSION)/$$path/ HEAD > $$p/rpmbuild/tmp.tar && tar --concatenate --file=$$p/$(SOURCE) $$p/rpmbuild/tmp.tar && rm $$p/rpmbuild/tmp.tar); \
	done
This is a bit tricky to read, so I’ll try to break it down. Remember, double dollar signs are used in Make syntax for embedded shell code, since a single dollar sign introduces a Make variable; Make turns each $$ into one literal $ before handing the line to the shell. The $(VERSION) variable corresponds to the version of the project I’m building, which matches a git tag that I’ve previously created. $(SOURCE) corresponds to an output file name, ending in the .tar.bz2 suffix.
p=`pwd` && (echo .; git submodule foreach) | while read entering path; do
In this first line, we store the current working directory for use later, and then loop through the output of the git submodule foreach command. That output normally looks something like this:
james@computer:~/code/oh-my-vagrant$ git submodule foreach
Entering 'vagrant/gems/xdg'
Entering 'vagrant/kubernetes/templates/default'
Entering 'vagrant/p4h'
Entering 'vagrant/puppet/modules/module-data'
Entering 'vagrant/puppet/modules/puppet'
Entering 'vagrant/puppet/modules/stdlib'
Entering 'vagrant/puppet/modules/yum'
As you can see, the read command above eats up the Entering string and pulls the quoted path into the second variable, path. The next part of the code:
temp="$${path%\'}"; temp="$${temp#\'}"; path=$$temp; [ "$$path" = "" ] && continue;
uses bash idioms to remove the two single quotes that wrap our string, and then skip over any empty versions of the path variable in our loop. Lastly, for each submodule found, we first switch into that directory:
(cd $$path &&
Then we run a normal git archive command and create a plain uncompressed tar archive in a temporary directory:
git archive --prefix=oh-my-vagrant-$(VERSION)/$$path/ HEAD > $$p/rpmbuild/tmp.tar &&
Then we use the magic of tar to append the contents of this new tar file onto the source file that we’re building up with each iteration of this loop, and then remove the temporary file:
tar --concatenate --file=$$p/$(SOURCE) $$p/rpmbuild/tmp.tar && rm $$p/rpmbuild/tmp.tar);
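If you haven’t used tar --concatenate before, here is a minimal sketch of that one step on its own, with no git involved. It assumes GNU tar, and the directory and file names are invented for the example. Both archives must be uncompressed, since tar can’t concatenate compressed ones.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Two small trees standing in for the main project and one submodule.
# (Hypothetical names, for illustration only.)
mkdir -p project/sub
echo "main" > project/README
echo "sub"  > project/sub/file.txt

# Build two separate, uncompressed archives.
tar -cf main.tar project/README
tar -cf sub.tar  project/sub/file.txt

# Append the members of sub.tar onto the end of main.tar (GNU tar).
tar --concatenate --file=main.tar sub.tar

combined=$(tar -tf main.tar)
echo "$combined"
```

After the concatenate step, main.tar lists both project/README and project/sub/file.txt, which is what the loop does once per submodule.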
Finally, we end the loop:
done
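The parsing half of that loop is easy to try by itself. The sketch below feeds a hard-coded sample (an assumption standing in for real git submodule foreach output, including the leading "." line) through the same read and quote-stripping steps. Since it runs outside of Make, the dollar signs are not doubled.

```shell
#!/usr/bin/env bash
# Sample input mimicking what the recipe pipes into the loop: the "." line
# from echo, followed by "Entering '<path>'" lines from git.
sample=".
Entering 'vagrant/p4h'
Entering 'vagrant/puppet/modules/stdlib'"

paths=()
while read -r entering path; do
    temp="${path%\'}"              # drop the trailing single quote
    temp="${temp#\'}"              # drop the leading single quote
    path=$temp
    [ "$path" = "" ] && continue   # the "." line has no second field
    paths+=("$path")
done <<< "$sample"

printf '%s\n' "${paths[@]}"
```

The "." line leaves path empty, so it gets skipped, and the two remaining lines come out as bare paths without their quotes.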
Boom, magic! Short, concise, and without any dependencies but bash, git, and tar.
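If you’d rather not live inside a Makefile, the same idea can be sketched as a standalone bash function. This is not what oh-my-vagrant uses: the function name archive_with_submodules and its arguments are hypothetical, and instead of parsing the Entering lines it asks git for each path with git submodule foreach --quiet and the $sm_path variable, which needs a reasonably recent git. It also assumes GNU tar, that you run it from the top of the repository, and that only direct (not nested) submodules matter.

```shell
#!/usr/bin/env bash
set -euo pipefail

# archive_with_submodules PREFIX OUTPUT [TREEISH]   (hypothetical helper)
# Writes an uncompressed tar of TREEISH (default HEAD) to OUTPUT, then
# appends the checked-out HEAD of every direct submodule under PREFIX/.
archive_with_submodules() {
    local prefix=$1 output=$2 treeish=${3:-HEAD}
    local top tmp path
    top=$(pwd)        # assumed to be the top level of the repository
    tmp=$(mktemp)     # scratch file for each submodule archive

    # Force plain tar so that tar --concatenate can append to it later.
    git archive --format=tar --prefix="$prefix/" -o "$output" "$treeish"

    # --quiet suppresses the "Entering ..." lines; $sm_path is expanded by
    # git for each submodule, giving its path relative to the superproject.
    git submodule foreach --quiet 'echo "$sm_path"' | while read -r path; do
        (cd "$top/$path" && git archive --prefix="$prefix/$path/" HEAD > "$tmp")
        tar --concatenate --file="$output" "$tmp"
    done

    rm -f "$tmp"
}
```

You would call it as something like archive_with_submodules oh-my-vagrant-1.0 source.tar, and compress the result afterwards if you want a smaller file.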
Nobody should have to figure that out by themselves, and I wish it was built into git, but until then, here’s how it’s done! Many thanks to #git on IRC for pointing me in the right direction.
This is the commit where I landed this patch for oh-my-vagrant, if you’re curious to see this in the wild. Now that this is done, I can definitely say that it was worth the time:
With this feature merged, along with my automatic COPR builds, a simple make rpm causes all of this automation to happen, and delivers a fresh build from git in a few minutes.
I hope you enjoyed this technique, and I hope you have some coding skills to get this feature upstream in git.
Happy Hacking,
James