Git mirroring

Not many people choose to run their own GitLab instance these days. My preference for self-reliance means that I do. If you value self-reliance too, I have some recommendations:

  • Use Ansible, Chef, or Puppet to build your GitLab instance, because you are going to build two.
  • Build one GitLab server for your group’s consumption. Put this one in a data center close to your users for good performance.
  • Build a second GitLab server in a remote location, perhaps at your favorite cloud provider. Wherever the second GitLab instance is, you’ll want either one-way or bidirectional access via https or ssh between the two servers.
  • Follow the directions in your GitLab under Help -> User Documentation -> Mirror a Repository to mirror the repository from the primary to the secondary.

At this point, you’ve created a great plan B for disaster recovery in case something terrible happens to your GitLab. For me, GitLab stores the Terraform and Ansible that I use to build my infrastructure. The goal is to be able to jumpstart whatever you need from the mirror. I call the mirror my plan B because my plan A is to restore GitLab directly from a nightly backup.

Setting up the mirror

Setting up the mirror is well documented. In broad strokes, here are the steps:

  • On the mirror, create a group and project to hold the mirrored repository.
  • Choose Push or Pull mirroring. In Push, the primary will push updates to the secondary as you work. In Pull, the mirror will periodically poll the primary for changes. You’ll have to decide what works best for you.
  • Fill out the form and perform any needed setup. When using push over ssh, this means setting up the primary to push and then copying the ssh public key from the primary and adding it as an allowed key on the mirror user.
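
If you have many repositories, filling out the form by hand gets tedious. The same push mirror setup can be sketched against GitLab's remote mirrors API. This is only a sketch under assumptions: the host names, the project path, the mirror-bot user, and both tokens below are hypothetical placeholders, and you should confirm that your GitLab version exposes this endpoint before relying on it. The function only builds the request; nothing is sent until you pass it to urllib.request.urlopen.

```python
import json
import urllib.parse
import urllib.request


def build_push_mirror_request(primary_api, project_path, mirror_url, api_token):
    """Build (but do not send) a request that adds a push mirror to a project.

    primary_api  -- API base of the primary, e.g. https://git.example.com/api/v4
    project_path -- group/project path of the repository on the primary
    mirror_url   -- URL of the empty project on the mirror, credentials included
    api_token    -- access token for the primary (hypothetical placeholder)
    """
    # The API accepts the URL-encoded "group/project" path in place of a numeric id.
    project_id = urllib.parse.quote(project_path, safe="")
    endpoint = f"{primary_api.rstrip('/')}/projects/{project_id}/remote_mirrors"
    body = json.dumps({"url": mirror_url, "enabled": True}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={"PRIVATE-TOKEN": api_token, "Content-Type": "application/json"},
    )


# Hypothetical values for illustration only.
request = build_push_mirror_request(
    "https://git.example.com/api/v4",
    "group/project",
    "https://mirror-bot:MIRROR_TOKEN@mirror.example.com/group/project.git",
    "PRIMARY_API_TOKEN",
)
print(request.get_method(), request.full_url)
```

To actually create the mirror, you would pass the returned request to urllib.request.urlopen and check the response; looping over a list of project paths gives you one call per repository.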

As you configure mirroring, remember that constructing the mirror URL can be tricky, especially if you want to use ssh as the transport. A typical git cloning string looks like this: git@git.example.com:group/project.git but the mirroring URL for the same repository is: ssh://git@git.example.com/group/project.git. The difference lies between the server, git.example.com, and the path. When cloning, the separator is a colon, ‘:’. When mirroring, it’s a slash, ‘/’.

Getting authentication right can also be tricky. Mirroring more than a few repositories over ssh becomes awkward because GitLab generates a new ssh key for each repository. My concern is that the proliferation of keys may eventually cause git to fail with a “too many authentication attempts” error. This is one of the few places where I like git+https more than git+ssh: you can create a single user token and use it for all of your mirror operations.

That said, git+https is not without its pitfalls. If, like me, you have your own CA, then you have the additional problem that git doesn’t do a good job of configuring curl’s CA. You have two choices here. On the box initiating the transfer, run: git config --global http.sslCAPath my-ca-path to use your CA directory, or git config --global http.sslCAInfo my-ca-file.pem to use your CA file.
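
The colon-versus-slash difference between the cloning string and the mirroring URL is easy to get wrong by hand, so here is a minimal sketch of the conversion. The function name clone_to_mirror_url is my own, and it assumes the usual user@host:path cloning form that GitLab shows; anything else is rejected rather than guessed at.

```python
import re

# Matches scp-style cloning strings such as git@git.example.com:group/project.git
_SCP_STYLE = re.compile(r"^(?P<user>[^@/:\s]+)@(?P<host>[^:/\s]+):(?P<path>\S+)$")


def clone_to_mirror_url(clone_url: str) -> str:
    """Convert an scp-style cloning string into an ssh:// mirroring URL."""
    if clone_url.startswith("ssh://"):
        # Already in the form the mirroring form expects.
        return clone_url
    match = _SCP_STYLE.match(clone_url)
    if match is None:
        raise ValueError(f"not an scp-style cloning string: {clone_url!r}")
    # Swap the ':' separator for '/' and avoid producing a double slash.
    path = match.group("path").lstrip("/")
    return f"ssh://{match.group('user')}@{match.group('host')}/{path}"


print(clone_to_mirror_url("git@git.example.com:group/project.git"))
# prints ssh://git@git.example.com/group/project.git
```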

Once you’ve set up mirroring, you have a great plan B if the day ever comes that your GitLab server becomes unavailable. You should have a working GitLab mirror that you can use in any way you please. You can even take backups of the mirror server so you have a redundant, offsite backup.