How to merge multiple Git repos and keep their commit history
Background
I maintain an Eleventy plugin, Embed Everything, which aggregates a bunch of other plugins. Until recently, I was keeping a separate Git repository for each package. It was getting to be a hassle, having to track issues, dependencies, and maintenance tasks across nine different repos. So after some experimentation, I decided to combine them all into a single monorepo. Here's how I did it.
This process is based largely on this blog post by Willem Cheizoo. I very slightly modified Willem's process, however, so I wanted to document it here.
Goals
- Merge several separate repos into one.
- Preserve the complete Git commit history. This was important because I wanted to ensure that volunteers continue to receive credit for their contributions to the codebase.
Overview
This is an overview of the basic steps. I'll go through each one in greater detail.
- Re-arrange each repo so they can merge without conflicts
- Add each repo as a remote source for the monorepo
- Merge the repos with --allow-unrelated-histories
Prepare each repo for merging
The first task is to get each repo into the target file structure. It's important to do this ahead of time, to prevent merge conflicts.
- Check out a working branch, so we can bail out safely:
$ git checkout -b monorepo-prep
- Create the target folder structure:
$ mkdir packages && mkdir packages/everything
- Move files into that new structure. I like using git mv:
$ git mv package.json packages/everything
Repeat this step as necessary, until all the files are moved.
- Finally, commit all your changes:
$ git add -A
$ git commit -m "Prep for monorepo migration"
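If a repo has many top-level files, moving them one at a time gets tedious. Here is a rough sketch of the same prep steps with a loop. It builds a throwaway repo in a temp directory so it is safe to run anywhere; the file names (index.js, lib/util.js) and the Demo identity are placeholders, not the plugin's real contents.

```shell
set -eu

# Scratch repo with made-up files, standing in for the real "everything" repo.
demo="$(mktemp -d)"
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
mkdir lib
echo '{}' > package.json
echo 'module.exports = {};' > index.js
echo '// helper' > lib/util.js
git add -A
git commit -q -m "Initial commit"

# Same steps as above: working branch, then the target folder.
git checkout -q -b monorepo-prep
target="packages/everything"
mkdir -p "$target"

# List every tracked top-level entry (files and folders) in HEAD
# and move each one into the target folder.
git ls-tree --name-only HEAD | while IFS= read -r entry; do
  git mv "$entry" "$target/"
done

git commit -q -m "Prep for monorepo migration"

# Show the resulting layout.
git ls-files
```

Because the list comes from git ls-tree, untracked files and the new packages folder itself are left alone.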
Repeat for each repo
Follow this same process for each repo that you're planning to merge. The goal is to ensure that they all have parallel file structures. If they don't, you're likely to get git conflicts that are a pain to reconcile.
In my case, I did the same thing for the youtube repo, moving its files into packages/youtube.
We're now ready to merge the two repos into one.
Connect the repos
We'll allow these separate repos to communicate by connecting them using git remote.
Typically, you'd use the git remote command to connect your local repo to an upstream repo over the network, such as GitHub. But you can use it to track any repo, including the ones on your local file system:
# ./everything
# -f = <f>etch the list of branches from the remote repo
# youtube = <name> for the remote repo
# ../youtube = <url> of the "remote" repo. In this case, just a relative path
$ git remote add -f youtube ../youtube
So the everything repo is now tracking the youtube repo as a remote source. You're now ready to merge these separate repos into one.
Merge the repos
We'll use git merge to pull the youtube repo into the everything repo, including its complete commit history. The --allow-unrelated-histories flag is what makes this whole thing work. Without it, Git refuses to merge two branches that share no common ancestor.
By default, a successful merge command creates a commit. I like using the --no-commit flag so I can confirm the new file structure before committing the changes, but it's optional.
# ./everything
# youtube/monorepo-prep = <remote repo name>/<branch name>
# --allow-unrelated-histories = It's OK to smoosh these repos together
# --no-commit = (Optional) Stage instead of committing
$ git merge youtube/monorepo-prep --allow-unrelated-histories --no-commit
At this point you can validate that the files were moved as expected:
└── everything
    └── packages
        ├── everything
+       └── youtube
You can run git log --all to check that the commit history was merged. (The --all flag lets you view the merged commit history before completing the commit.) The two histories are merged together in reverse chronological order.
If you're satisfied that everything worked as expected, you can git commit the results. At this point, you've successfully merged the two repos, including their complete Git commit history.
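Since the whole point was to keep contributors' credit intact, it helps to check authorship after the merge is committed. This is a small sketch using two throwaway repos with made-up authors (alice and bob); make_repo is a hypothetical helper and none of these names come from the real project.

```shell
set -eu

work="$(mktemp -d)"

# Hypothetical helper: scratch repo with one commit by the given author.
make_repo() {
  name="$1"
  author="$2"
  git init -q "$work/$name"
  cd "$work/$name"
  git config user.name "$author"
  git config user.email "$author@example.com"
  git checkout -q -b monorepo-prep
  mkdir -p "packages/$name"
  echo "// $name" > "packages/$name/index.js"
  git add -A
  git commit -q -m "Prep $name for monorepo migration"
}

make_repo everything alice
make_repo youtube bob

# Connect, merge without committing, then commit once satisfied.
cd "$work/everything"
git remote add -f youtube ../youtube
git merge youtube/monorepo-prep --allow-unrelated-histories --no-commit
git commit -q -m "Merging repos"

# Both histories should now hang off the merge commit.
git --no-pager log --oneline --graph

# Commit counts per author; both alice and bob should be listed.
git shortlog -sn HEAD
```

Passing HEAD to git shortlog explicitly matters in scripts, because without a revision it reads from standard input when that is not a terminal.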
Cleanup
There are several things you might want to do at this point. There are some files in each package directory that are likely no longer needed. For example, you can keep a single .gitignore file in the project root and delete all the individual ones in each package. You can also git remote rm youtube, since you probably don't need that remote connection anymore. And there are other tasks required to set everything up as a monorepo, but that's a different post.
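As a rough illustration of that cleanup, the sketch below folds per-package .gitignore files into one root file and drops the leftover remote. It runs against a throwaway repo with placeholder contents. One assumption to watch: patterns in a nested .gitignore are relative to that folder, so anchored rules (ones starting with a slash) may need adjusting by hand once they move to the root.

```shell
set -eu

# Scratch repo standing in for the freshly merged monorepo.
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
mkdir -p packages/everything packages/youtube
echo "node_modules/" > packages/everything/.gitignore
echo "node_modules/" > packages/youtube/.gitignore
echo "// everything" > packages/everything/index.js
echo "// youtube" > packages/youtube/index.js
git add -A
git commit -q -m "Merging repos"
git remote add youtube ../youtube   # stand-in for the leftover remote

# Collect the unique ignore rules into a single root-level file.
cat packages/*/.gitignore | sort -u > .gitignore

# Remove the per-package copies and stage the new root file.
git rm -q packages/*/.gitignore
git add .gitignore

# The remote connection is no longer needed.
git remote rm youtube

git commit -q -m "Clean up after monorepo merge"
git ls-files
```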
Speedrun
Here's a condensed run-through of this full procedure as it looks on the command line:
# ./
$ cd everything/
$ git checkout -b monorepo-prep
$ mkdir packages
$ mkdir packages/everything
$ git mv index.js packages/everything
# ... repeat the above step for all files
$ git add -A
$ git commit -m "Prepping monorepo"
# switch to the youtube repo
$ cd ../youtube
$ git checkout -b monorepo-prep
$ mkdir packages
$ mkdir packages/youtube
$ git mv index.js packages/youtube
# ... repeat the above step for all files
$ git add -A
$ git commit -m "Prepping monorepo"
# switch back to the monorepo folder
$ cd ../everything
$ git remote add -f youtube ../youtube
$ git merge youtube/monorepo-prep --no-commit --allow-unrelated-histories
# Inspect that everything worked as expected. If so:
$ git add -A
$ git commit -m "Merging repos"
$ git remote rm youtube
# Done!
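With more than two repos, the connect-and-merge steps can be wrapped in a loop. The following is only a sketch under some assumptions: every repo sits next to the monorepo folder, each one already has a monorepo-prep branch in the packages/<name> layout, and the merges commit directly instead of pausing with --no-commit. The vimeo repo and the make_repo helper are hypothetical, included just to give the loop a second item.

```shell
set -eu

work="$(mktemp -d)"

# Hypothetical helper: scratch repo already prepped for the monorepo layout.
make_repo() {
  name="$1"
  git init -q "$work/$name"
  cd "$work/$name"
  git config user.name "Demo"
  git config user.email "demo@example.com"
  git checkout -q -b monorepo-prep
  mkdir -p "packages/$name"
  echo "// $name" > "packages/$name/index.js"
  git add -A
  git commit -q -m "Prep $name for monorepo migration"
}

make_repo everything
make_repo youtube
make_repo vimeo   # hypothetical second package

cd "$work/everything"
for name in youtube vimeo; do
  # Connect the sibling repo and fetch its branches.
  git remote add -f "$name" "../$name"
  # Merge its prepped branch, keeping its full history.
  git merge "$name/monorepo-prep" --allow-unrelated-histories \
    -m "Merge $name into monorepo"
  # Drop the remote once its history is merged.
  git remote rm "$name"
done

# All three packages should now live side by side.
git ls-files
```

Stopping on the first failed merge (via set -e) is deliberate here, since a conflict in one repo is easier to sort out before the next one is layered on top.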