Git: Merging multiple repositories into a mono-repo

How I converted my project of three repositories into one monorepo - without losing the git history!

tldr;

Skip to the next section if you just want the instructions and not my reasons for the change.

When I started with Gachou, I decided to create different repositories for web-ui, backend, installer, docs and e2e-tests. Having multiple repositories clearly has some advantages:

  • I have only one technology stack for each repo. It is either Node.js or Java. This means that for example, I can use husky to manage my pre-commit hooks for the Node.js repositories.
  • Pipelines only run for the repository I make changes in. Since GitLab reduced its free pipeline time to 400 minutes per month in June 2022, I have to save pipeline time.

But it also has disadvantages. The deciding reason for me was:

I want to run unit, integration and e2e-tests in the GitLab pipeline. And I want to run them all before merging to the main branch. And although we often use trunk-based development at work, I think creating MRs is more appropriate for open-source projects. I don’t think you should rely on contributors still being around when e2e-tests fail after the merge…

No, e2e-tests have to run before the merge. So, if e2e-tests are in a different repository, you have to synchronize feature-branches. You have to branch the backend and the web-ui project to write the changes. And you have to branch the e2e-tests to write the new tests. This quickly gets messy and unintuitive. So I decided to move all projects into a single git-repository.

How to merge repos?

So I have these separate projects:

gachou-web-ui
├── package.json
└── ...
gachou-backend
├── pom.xml
└── ...
gachou-installer
├─┬ e2e-tests/
│ ├── package.json
│ └── ...
└── gachou-docker-compose/

And I want to preserve the history for all repositories when converting it into the following structure:

gachou
├─ web-ui
│  ├── package.json
│  └── ...
├─ backend
│  ├── pom.xml
│  └── ...
├─ e2e-testing
│  ├── package.json
│  └── ...
└─ installer
   └── docker-compose

You’ll notice that there are some edge cases built in: while web-ui and backend should simply move into their own directories, e2e-testing has to be extracted from installer and put into its own directory, and gachou-docker-compose has to be renamed.

We start by creating the new repo:

mkdir gachou
cd gachou
git init
touch README
git add README
git commit -m "add README"

Also, we will use the tool git-filter-repo to move files and rewrite history. So let’s install that first:

pip install git-filter-repo
export PATH="$HOME/.local/bin:$PATH"
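Before going on, it can save some confusion to check that the tool is actually reachable. The following is only a small sketch; the variable name filter_repo_status is made up for this example, and the hint about ~/.local/bin assumes a user-level pip install like the one above.

```shell
# Report whether the git-filter-repo executable can be found on PATH.
if command -v git-filter-repo >/dev/null 2>&1; then
  filter_repo_status="found at $(command -v git-filter-repo)"
else
  # Assumption: pip installed it into ~/.local/bin, which is not on PATH yet.
  filter_repo_status="not on PATH (check where pip put it, e.g. ~/.local/bin)"
fi
echo "git-filter-repo: $filter_repo_status"
```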

Merging a repo into a subdirectory

For web-ui and backend, the process that we need to do now is:

  • Clone the sub-repo
  • Move all the files into a subdirectory, rewriting history.
  • Merge the sub-repo into the mono-repo, ignoring the fact that the two repos have no common commits at all.

I use the following commands.


# Clone the sub-repo (run this inside the mono-repo directory).
git clone git@gitlab.com:gachou/gachou-web-ui.git
# Move everything to a subdirectory and prefix existing tags with "web-ui-"
( cd gachou-web-ui && git filter-repo --to-subdirectory-filter "web-ui" --tag-rename '':'web-ui-' )
# In the mono-repo, add a "remote" pointing to the local clone and fetch it
git remote add -f gachou-web-ui gachou-web-ui
# Merge the sub-repo into the mono-repo, ignoring missing common commits.
git merge -m "integrating web-ui" gachou-web-ui/main --allow-unrelated-histories
# Cleanup
rm -rf gachou-web-ui
git remote rm gachou-web-ui

For the gachou-backend repository, I just replaced “web-ui” with “backend”.
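Since only the name changes between the two runs, the steps above could also be wrapped in a small shell function. This is a sketch, not what I actually ran: the function name merge_subrepo, its parameters and the "import-" prefix for the temporary clone are made up for illustration, and the backend URL in the usage comment is an assumption derived from the web-ui one.

```shell
# Hypothetical helper: merge_subrepo <clone-url> <target-dir> [branch]
# Run it from the root of the mono-repo. Requires git-filter-repo.
merge_subrepo() {
  url="$1"
  dir="$2"
  branch="${3:-main}"       # assumes the sub-repo's main branch is called "main"
  tmp="import-$dir"         # throwaway clone, also used as the remote name

  git clone "$url" "$tmp" || return 1
  # Move everything into <dir>/ and prefix all tags with "<dir>-"
  ( cd "$tmp" && git filter-repo --to-subdirectory-filter "$dir" --tag-rename '':"$dir-" ) || return 1
  # Add the rewritten clone as a remote and fetch it
  git remote add -f "$tmp" "$tmp" || return 1
  # Merge it, ignoring missing common commits
  git merge -m "integrating $dir" "$tmp/$branch" --allow-unrelated-histories || return 1
  # Cleanup
  rm -rf "$tmp"
  git remote rm "$tmp"
}

# Usage (the second URL is an assumption):
#   merge_subrepo git@gitlab.com:gachou/gachou-web-ui.git web-ui
#   merge_subrepo git@gitlab.com:gachou/gachou-backend.git backend
```

The function stops at the first failing step, so a failed clone or a merge conflict leaves the temporary clone in place for inspection.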

Extracting a single directory

For the e2e-testing directory, the principle stays the same: I cloned the gachou-installer repo, but this time I used a slightly different command line for git filter-repo:

git filter-repo  --path e2e-test/ --path-rename "e2e-test:e2e-testing" --tag-rename '':'e2e-testing-'

This only includes changes in the e2e-test directory, and renames that directory to e2e-testing.

Renaming a subdirectory

For the docker-compose directory in the gachou-installer repo, I used a combination of both commands:

git filter-repo --to-subdirectory-filter "installer" \
  --path gachou-docker-compose \
  --path-rename="installer/gachou-docker-compose":"installer/docker-compose" \
  --tag-rename '':'installer-'

This only includes the gachou-docker-compose directory, moves it inside installer and then renames it to docker-compose.
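After the merges, it may be worth checking that the old commits are really reachable from the mono-repo. The sketch below reproduces the merge step with two throwaway repositories in a temporary directory, so it needs neither git-filter-repo nor network access. The repo names sub and mono, the helper new_repo and the demo identity are all made up for this example; the sub repo simply stands in for a clone whose files have already been moved into web-ui/.

```shell
set -eu
work="$(mktemp -d)"

# Create an empty repo on branch "main" with a local demo identity.
new_repo() {
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/main
  git -C "$1" config user.name "Demo"
  git -C "$1" config user.email "demo@example.com"
}

# Stand-in for a sub-repo after filter-repo: its files already live in web-ui/.
new_repo "$work/sub"
mkdir "$work/sub/web-ui"
echo '{}' > "$work/sub/web-ui/package.json"
git -C "$work/sub" add .
git -C "$work/sub" commit -q -m "web-ui: initial commit"

# Stand-in for the mono-repo with its README commit.
new_repo "$work/mono"
touch "$work/mono/README"
git -C "$work/mono" add README
git -C "$work/mono" commit -q -m "add README"

# Same merge step as in the real migration.
cd "$work/mono"
git remote add -f sub ../sub
git merge -q -m "integrating web-ui" sub/main --allow-unrelated-histories

# All commits reachable from HEAD: README, the web-ui commit, the merge commit.
total_commits="$(git rev-list --count HEAD)"
# Subjects of the commits that touched web-ui/, as seen from the mono-repo.
web_ui_log="$(git log --format=%s -- web-ui)"
echo "total commits: $total_commits"
echo "web-ui history: $web_ui_log"
```

In the real mono-repo, the same path-limited git log (for example on web-ui, backend or installer/docker-compose) should list the commits from the original repositories, and git tag -l should show the renamed tags.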

What remains to be done…

I thought I was finished with the monorepo setup, until my pipelines failed. Some things had to be adjusted manually after the migration:

  • I added a new .gitlab-ci.yml file to the top directory and used parent-child pipelines to call the pipelines of the individual sub-projects.
  • I had to add the subdirectory to the paths in each artifacts: directive, and check every cache: directive in all .gitlab-ci.yml files.
  • I had to think of something for the pre-commit hooks for linting and formatting. As I already wrote, I was using husky in the Node.js projects, but… OK, this goes too far. Let’s catch up on this issue next week.