I recently needed to combine several Git repositories into a single one, with each old repo living in a subdirectory of the new repo. I could simply copy the files over manually, importing the contents of each project in a single commit, but I'd lose the commit history of each subproject. After some brief searching, I found this helpful article, which describes how to import the old repos, complete with their commit history, into the new repo.
So suppose you have two repos: subproj and mainproj. If you want
subproj to live inside mainproj in a directory called sub/,
the method consists of two steps:
1. Move all the files in the root of subproj into a directory called sub/.
2. Merge the modified subproj into mainproj, thereby putting the sub/ directory in mainproj.
It turns out the first step is harder than the second, since it
involves changing all the commits in the
subproj repo's history to
act on files located in the new subdirectory, rather than in the
project's root. The tool we'll use to do this is git filter-branch.
It lets you rewrite the project's revision history, similar to how
git rebase modifies your commits as it "replays" them on a new
branch. But where git rebase re-orders the existing commits,
filter-branch lets you run a shell script before re-applying each
commit. In our case that script moves all the files into
the new subdirectory.
Note that all the precautions that apply to
git rebase also apply to
git filter-branch. If you're changing a repository's commits, you
can't expect to push them back upstream. So these manipulations are
best done on projects you haven't shared yet, or, as is the case here,
that you plan to delete once you've merged them in elsewhere.
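To make that warning concrete, here is a minimal sketch run in a throwaway repository (the temporary directory and the a.txt file are made up for the demo). It records the tip commit's hash before and after a rewrite. The FILTER_BRANCH_SQUELCH_WARNING variable only silences the deprecation notice that newer Git versions print.

```shell
#!/bin/sh
# Sketch: a rewritten commit gets a new hash, which is why it can't be pushed
# back over the old history without forcing.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the notice newer Git prints
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)            # hypothetical scratch location
cd "$demo"
git init -q repo
cd repo
echo hello > a.txt
git add a.txt
git commit -q -m "Add a.txt"

before=$(git rev-parse HEAD)
git filter-branch -f --tree-filter 'mkdir -p .sub; mv * .sub; mv .sub sub' \
    -- --all >/dev/null 2>&1
after=$(git rev-parse HEAD)

# The two hashes differ even though the commit message is unchanged.
echo "before: $before"
echo "after:  $after"
```

Anyone who had already fetched the "before" commit would now have a history that no longer matches yours.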
The procedure to move the contents of
subproj's root into a
subdirectory is as follows.
$ git clone subproj subproj_tmp
$ cd subproj_tmp/
$ git filter-branch -f --prune-empty --tree-filter '
> mkdir -p .sub;
> mv * .sub;
> mv .sub sub
> ' -- --all
Let's break this down. We're using the -f switch to force
filter-branch to continue in situations where it would otherwise abort,
such as when a temporary directory from an earlier run is still present. The
--prune-empty switch tells it to skip empty commits, which may result
from the application of the filter. This is unlikely in our case, but
we may as well leave it in. The
--tree-filter switch is the meat of the command. Its argument
is a shell script executed in the root of the repository before the
re-application of each commit. The "
-- --all" arguments specify
that our filter is to be applied to all branches and tags.
It's worth noting that the
--tree-filter option does not honor any
.gitignore rules when creating the new commits, so "ignored" files
may find their way back into the commits if they are present in the
working repo. We avoided this by working in a fresh clone.
So before each commit is re-applied, we're creating the .sub/
directory, moving all files in the project's root into that directory,
then renaming it to sub/. We need to create the intermediate
.sub/ directory because otherwise mv * would try to move sub/
into itself and cause an error. But the shell's * glob skips hidden
files, so .sub/ is never one of the things being moved and the
above method works.
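The same three commands can be tried outside of Git to see the glob behaviour in isolation. This is a small sketch in a scratch directory; the file names are invented for the demo.

```shell
#!/bin/sh
# Sketch: the * glob does not match names starting with a dot,
# so .sub itself (and .gitignore) are left out of "mv * .sub".
set -e
work=$(mktemp -d)            # hypothetical scratch location
cd "$work"
touch a.txt b.txt .gitignore

mkdir -p .sub
mv * .sub                    # expands to: mv a.txt b.txt .sub
mv .sub sub

ls -A                        # top level now holds only .gitignore and sub
ls -A sub                    # a.txt and b.txt ended up in here
```

Note that .gitignore stays behind in the top-level directory, which is exactly the downside discussed next.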
A downside to this strategy is that any hidden files in the root of your
project, such as
.gitignore, will be skipped. We address this issue
by simply moving these files into the subdirectory manually, and
committing the change.
$ git mv .gitignore sub/
$ git commit -am "Move .gitignore into subdirectory."
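If the project has more than one hidden file, the same idea can be looped. The following is only a sketch of one way to do it, not the method from the original article: it builds a throwaway repo that mimics the state after the rewrite (the .editorconfig file is a made-up second dotfile), then moves every tracked top-level entry whose name starts with a dot.

```shell
#!/bin/sh
# Sketch: move all tracked top-level dotfiles into sub/ in one go.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)            # hypothetical scratch location
cd "$demo"
git init -q subproj_tmp
cd subproj_tmp
mkdir sub
echo code > sub/a.txt
echo '*.o' > .gitignore
echo 'root = true' > .editorconfig
git add .
git commit -q -m "State after the rewrite"

# ls-tree lists the top-level entries of HEAD; keep the ones starting with "."
git ls-tree --name-only HEAD | grep '^\.' | while IFS= read -r f; do
    git mv "$f" sub/
done
git commit -q -m "Move hidden files into subdirectory."

git ls-files                 # every tracked path now lives under sub/
```

This assumes plain file names; names containing unusual characters may be quoted by git ls-tree and would need extra handling.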
The repo now has a new
sub/ directory, but it also still has the
original files in the project root. These are untracked, however, so
they can be ignored for our purposes.
To clean up subproj_tmp, we use the git gc command to delete
loose objects, etc.
$ git gc --aggressive
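One thing worth knowing here: filter-branch keeps a backup of the pre-rewrite refs under refs/original/, so the old commits stay reachable and git gc will not discard them. Since we plan to delete subproj_tmp anyway this hardly matters, but if you do want the old objects gone, the sketch below (again in a throwaway repo with a made-up a.txt) follows the cleanup steps described in the git filter-branch documentation.

```shell
#!/bin/sh
# Sketch: remove filter-branch's backup refs so gc can prune the old commits.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the notice newer Git prints
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)            # hypothetical scratch location
cd "$demo"
git init -q subproj_tmp
cd subproj_tmp
echo hello > a.txt
git add a.txt
git commit -q -m "Add a.txt"
git filter-branch -f --tree-filter 'mkdir -p .sub; mv * .sub; mv .sub sub' \
    -- --all >/dev/null 2>&1

# The pre-rewrite branch tip was saved under refs/original/
git for-each-ref --format='%(refname)' refs/original/

# Delete the backups, expire the reflog, then prune unreachable objects
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now
```

Only do this once you are sure the rewrite came out the way you wanted, since it throws away the safety net.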
We can now merge subproj_tmp into mainproj.
$ cd ../mainproj
$ git remote add subproj ../subproj_tmp
$ git fetch subproj
$ git merge subproj/master
Here we added the subproj_tmp repo as a new remote for mainproj,
fetched it and merged it in. Since all of subproj's files now
live in the sub/ directory, the result of this merge is simply to
add the sub/ directory to mainproj.
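For anyone who wants to try the merge step in isolation, here is a self-contained sketch using two throwaway repos. The subproj_tmp repo is created with its files already under sub/, as a stand-in for the rewritten clone, and all file names are invented. One caveat: Git 2.9 and later refuse to merge two histories with no common ancestor unless you pass --allow-unrelated-histories, so the sketch includes that flag; older versions don't know it and don't need it.

```shell
#!/bin/sh
# Sketch: merge an unrelated repo whose files all live under sub/.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)            # hypothetical scratch location
cd "$demo"

# Helper (made up for this demo): new repo whose first branch is "master"
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
}

new_repo mainproj
echo main > mainproj/main.txt
git -C mainproj add main.txt
git -C mainproj commit -q -m "Add main.txt"

new_repo subproj_tmp         # stand-in for the rewritten subproject
mkdir subproj_tmp/sub
echo one > subproj_tmp/sub/a.txt
git -C subproj_tmp add sub/a.txt
git -C subproj_tmp commit -q -m "Add a.txt"
echo two >> subproj_tmp/sub/a.txt
git -C subproj_tmp commit -q -am "Extend a.txt"

cd mainproj
git remote add subproj ../subproj_tmp
git fetch -q subproj
git merge -q --allow-unrelated-histories -m "Merge subproj into sub/" subproj/master

# The subproject's own commits are now part of mainproj's history
git log --oneline -- sub/
```

Because the two trees don't overlap, the merge has nothing to conflict on, and git log on sub/ shows the subproject's commits rather than a single import commit.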
We can now delete the remote we created, clean up, and push
mainproj to its origin.
$ git remote rm subproj
$ git gc --aggressive
$ git push origin master
It's probably a good idea to delete our working subproj_tmp repo and
archive the original subproj, just in case.
$ cd ..
$ rm -rf subproj_tmp/
$ mkdir archive
$ mv subproj/ archive/
And that's how you make one repo a subproject of another one, while maintaining the commit history of both.
Version control software seems to follow the Pareto principle — you get 80% of the benefits by learning 20% of the features. The downside is you tend to never get around to learning that other 80%, which can be useful in a pinch. Problems like this are a good excuse to further explore that 80%.