How to merge multiple Git repos and keep their commit history

Background

I maintain an Eleventy plugin, Embed Everything, which aggregates a bunch of other plugins. Until recently, I was keeping a separate Git repository for each package. It was getting to be a hassle, having to track issues, dependencies, and maintenance tasks across nine different repos. So after some experimentation, I decided to combine them all into a single monorepo. Here's how I did it.

This process is based largely on this blog post by Willem Cheizoo. I modified Willem's process slightly, so I wanted to document my version here.

Goals

  1. Merge several separate repos into one.
  2. Preserve the complete Git commit history. This was important because I wanted to ensure that volunteers continue to receive credit for their contributions to the codebase.
# Before: two repos
├── everything
└── youtube
# After: two packages, one repo
└── monorepo
    └── packages
        ├── everything
        └── youtube

Overview

This is an overview of the basic steps. I'll go through each one in greater detail.

  1. Re-arrange each repo so they can merge without conflicts
  2. Add each repo as a remote source for the monorepo
  3. Merge the repos with --allow-unrelated-histories

Prepare each repo for merging

The first task is to get each repo into the target file structure. It's important to do this ahead of time, to prevent merge conflicts.

  1. Check out a working branch, so we can bail out safely:
    $ git checkout -b monorepo-prep
  2. Create the target folder structure:
    $ mkdir packages && mkdir packages/everything
  3. Move files into that new structure. I like using git mv:
    $ git mv package.json packages/everything
    Repeat this step as necessary, until all the files are moved.
# Before: 
└── everything
    ├── README.md
    ├── index.js
    ├── package.json
    └── ...
# After: 
└── everything 
    └── packages
        └── everything
            ├── README.md
            ├── index.js
            ├── package.json
            └── ...
  4. Finally, commit all your changes:
    $ git add -A
    $ git commit -m "Prep for monorepo migration"
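
If a repo has a lot of files, repeating git mv by hand gets tedious. Here's a minimal sketch of a helper that moves every tracked top-level entry into packages/<name> in one go. The function name move_into_package is my own invention (it's not a Git command), and it assumes the repo already has at least one commit and that the file names contain no unusual characters. Untracked files are left where they are.

```shell
# Hypothetical helper, not part of Git: move every tracked top-level
# file and directory into packages/<name>.
move_into_package() {
  pkg="$1"
  mkdir -p "packages/$pkg"
  # `git ls-tree --name-only HEAD` lists the top-level entries of the
  # last commit, one per line (it does not recurse into directories).
  git ls-tree --name-only HEAD | while IFS= read -r entry; do
    # Skip an already-tracked packages folder so we don't move it into itself.
    [ "$entry" = "packages" ] && continue
    git mv "$entry" "packages/$pkg/"
  done
}
```

From the root of the everything repo, you'd call it as move_into_package everything, then review the result with git status before committing.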

Repeat for each repo

Follow this same process for each repo that you're planning to merge. The goal is to ensure that they all have parallel file structures. If they don't, you're likely to get merge conflicts that are a pain to reconcile.

In my case, I did the same thing for the youtube repo:

# Before: 
└── youtube
    ├── README.md
    ├── index.js
    └── package.json
# After: 
└── youtube 
    └── packages
        └── youtube
            ├── README.md
            ├── index.js
            └── package.json

With both repos in the target structure, we're ready to connect them.

Connect the repos

We'll allow these separate repos to communicate by connecting them using git remote.

Typically, you'd use the git remote command to connect your local repo to an upstream repo over the network, such as GitHub. But you can use it to track any repo, including the ones on your local file system:

# ./everything
# -f          = <f>etch the remote repo's branches right after adding it
# youtube     = <name> for the remote repo
# ../youtube  = <url> of the "remote" repo. In this case, just a relative path

$ git remote add -f youtube ../youtube
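
Before merging, it can be worth confirming that the fetch actually brought in the branch you prepared. This is a small sketch of one way to check. The function name remote_branch_exists is hypothetical; the Git commands inside it are standard.

```shell
# Hypothetical helper: succeeds if the remote-tracking branch
# <remote>/<branch> exists in the current repo.
remote_branch_exists() {
  git rev-parse --verify --quiet "refs/remotes/$1/$2" >/dev/null
}

# Example usage from ./everything:
#   git remote -v      # lists the youtube remote and its path
#   git branch -r      # lists remote-tracking branches
#   remote_branch_exists youtube monorepo-prep && echo "ready to merge"
```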

So the everything repo is now tracking the youtube repo as a remote source. You're now ready to merge these separate repos into one.

Merge the repos

We'll use git merge to pull the youtube repo into the everything repo, including its complete commit history. The --allow-unrelated-histories flag is what makes this whole thing work.

By default, a successful merge command creates a commit. I like using the --no-commit flag so I can confirm the new file structure before committing the changes, but it's optional.

# ./everything
# youtube/monorepo-prep       = <remote repo name>/<branch name>
# --allow-unrelated-histories = It's OK to smoosh these repos together
# --no-commit                 = (Optional) Stage instead of committing

$ git merge youtube/monorepo-prep --allow-unrelated-histories --no-commit

At this point you can validate that the files landed where you expected:

└── everything
    └── packages
        ├── everything
+       └── youtube
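
Because --no-commit leaves the merge staged but uncommitted, you can also inspect it from the command line and back out if something looks wrong. Here's a rough sketch. The preview_merge name is made up for this example; the commands it wraps are ordinary Git.

```shell
# Hypothetical helper: summarize what the pending merge would add.
preview_merge() {
  git status --short         # staged files, e.g. "A  packages/youtube/index.js"
  git diff --cached --stat   # per-file summary of the staged changes
}

# If the result isn't what you expected, undo the pending merge with:
#   git merge --abort
```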

You can run git log --all to check that the commit history came across. (The --all flag includes the youtube/monorepo-prep remote-tracking branch, so you can see its commits before the merge commit exists.) The log shows the two histories interleaved in reverse chronological order.

If you're satisfied that everything worked as expected, you can git commit the results. At this point, you've successfully merged the two repos, including their complete Git commit history.
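
Since preserving contributor credit was one of the goals, you may want a quick check after committing. The sketch below is one way to do that, under the assumption that the remote is still named youtube and hasn't been removed yet (removing the remote also deletes its remote-tracking branches). The function names are hypothetical.

```shell
# Hypothetical helper: succeeds if every commit on the given ref is
# part of the current branch's history.
history_contains() {
  git merge-base --is-ancestor "$1" HEAD
}

# Hypothetical helper: list contributors with commit counts, so you can
# confirm that authors from both repos show up.
list_contributors() {
  git shortlog -sn HEAD
}

# Example usage from ./everything, after committing the merge:
#   history_contains youtube/monorepo-prep && echo "youtube history is included"
#   list_contributors
```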

Cleanup

There are several things you might want to do at this point. There are some files in each package directory that are likely no longer needed. For example, you can keep a single .gitignore file in the project root and delete all the individual ones in each package. You can also git remote rm youtube, since you probably don't need that remote connection anymore. And there are other tasks required to set everything up as a monorepo, but that's a different post.

Speedrun

Here's a condensed run-through of this full procedure as it looks on the command line:

# ./
$ cd everything/
$ git checkout -b monorepo-prep
$ mkdir packages
$ mkdir packages/everything
$ git mv index.js packages/everything
  # ... repeat the above step for all files
$ git add -A
$ git commit -m "Prepping monorepo"
# switch to the youtube repo
$ cd ../youtube
$ git checkout -b monorepo-prep
$ mkdir packages
$ mkdir packages/youtube
$ git mv index.js packages/youtube
  # ... repeat the above step for all files
$ git add -A
$ git commit -m "Prepping monorepo"
# switch back to the monorepo folder
$ cd ../everything
$ git remote add -f youtube ../youtube
$ git merge youtube/monorepo-prep --no-commit --allow-unrelated-histories
# Inspect that everything worked as expected. If so:
$ git add -A
$ git commit -m "Merging repos"
$ git remote rm youtube
# Done!