Git mirroring

Not many people choose to run their own GitLab instance these days. My preference for self-reliance means that I do. If you value self-reliance too, I have some recommendations:

  • Use Ansible, Chef, or Puppet to build your GitLab instance, because you are going to build two.
  • Build one GitLab server for your group’s consumption. Put this one in a data center close to your users for good performance.
  • Build a second GitLab server in a remote location, perhaps at your favorite cloud provider. Wherever the second GitLab instance is, you’ll want either one-way or bidirectional access via https or ssh between the two servers.
  • Follow the directions in your GitLab under Help -> User Documentation -> Mirror a Repository to mirror the repository from the primary to the secondary.

At this point, you’ve created a great plan B for disaster recovery in case something terrible happens to your GitLab. For me, GitLab stores the Terraform and Ansible that I use to build my infrastructure. The goal is to be able to jump-start whatever you need to rebuild from the mirror. I call the mirror my plan B because my plan A is to restore GitLab directly from a nightly backup.

Setting up the mirror

Setting up the mirror is well documented. In broad strokes, here are the steps:

  • On the mirror, create a group and project to hold the mirrored repository.
  • Choose Push or Pull mirroring. In Push, the primary pushes updates to the secondary as you work. In Pull, the mirror periodically polls the primary for changes. You’ll have to decide what works best for you.
  • Fill out the form and perform any needed setup. When using push over ssh, this means setting up the primary to push, then copying the ssh public key from the primary and adding it as an allowed key for the mirror user on the secondary.

As you configure mirroring, remember that constructing the mirror URL can be tricky, especially if you want to use ssh as the transport. A typical git clone string looks like this: git@git.example.com:group/project.git, but the mirroring URL for the same repository is: ssh://git@git.example.com/group/project.git. The difference lies between the server, git.example.com, and the path. When cloning, the separator is a colon, ‘:’. When mirroring, it’s a slash, ‘/’.

Getting the authentication right can also be tricky. Mirroring more than a few repositories over ssh becomes unwieldy because GitLab generates a new ssh key for each repository. My concern with git+ssh here is that the proliferation of keys may eventually cause git to fail with a “too many authentication failures” error. This is one of the few places where I like git+https more than git+ssh: you can create a single user token for all of your mirror operations.

That said, git+https is not without its pitfalls. If, like me, you run your own CA, you have the additional problem that git doesn’t do a good job of configuring curl’s CA. You have two choices here. On the box initiating the transfer, run: git config --global http.sslCAPath my-ca-path to use your CA directory, or git config --global http.sslCAInfo my-ca-file.pem to use your CA file.
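The colon-versus-slash difference between a clone string and a mirroring URL can be sketched in Python. This is only an illustration: the function name to_mirror_url and the regular expression are my own assumptions, not something GitLab provides, and it handles only the simple user@host:path form.

```python
import re

# Matches scp-style git remotes such as git@git.example.com:group/project.git
_SCP_STYLE = re.compile(r"^(?P<user>[^@/:]+)@(?P<host>[^:/]+):(?P<path>[^/].*)$")


def to_mirror_url(clone_string):
    """Convert an scp-style clone string into an ssh:// URL (hypothetical helper)."""
    if "://" in clone_string:
        return clone_string  # already in URL form, leave it alone
    match = _SCP_STYLE.match(clone_string)
    if match is None:
        raise ValueError("not an scp-style git remote: %r" % clone_string)
    # The colon after the host becomes a slash in the URL form.
    return "ssh://{user}@{host}/{path}".format(**match.groupdict())


print(to_mirror_url("git@git.example.com:group/project.git"))
```

Strings that already contain "://" are passed through unchanged, so the helper is safe to run over a mixed list of remotes.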

Once you’ve set up mirroring, you have a great plan B if the day ever comes that your GitLab server becomes unavailable. You should have a working GitLab mirror that you can use in any way that you please. You can even pull backups of the mirror server so you have a redundant, offsite backup.
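One way to gain some confidence that the mirror is current is to compare branch and tag hashes on both servers. The following is a minimal sketch, not a GitLab feature: the function names are hypothetical, and it assumes that git ls-remote can reach both URLs from wherever you run it. Only refs/heads/ and refs/tags/ are compared, on the assumption that a server may advertise other refs that a mirror does not carry.

```python
import subprocess


def parse_ls_remote(output):
    """Turn git ls-remote output (one '<sha> <ref>' pair per line) into {ref: sha}."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split(None, 1)
        refs[ref.strip()] = sha
    return refs


def ls_remote(url):
    """Ask a remote for its refs; needs network access and working credentials."""
    result = subprocess.run(
        ["git", "ls-remote", url], check=True, capture_output=True, text=True
    )
    return parse_ls_remote(result.stdout)


def out_of_sync(primary_refs, mirror_refs):
    """Return branches and tags that are missing or different on the mirror."""
    wanted = ("refs/heads/", "refs/tags/")
    return sorted(
        ref
        for ref, sha in primary_refs.items()
        if ref.startswith(wanted) and mirror_refs.get(ref) != sha
    )
```

A typical use would be out_of_sync(ls_remote(primary_url), ls_remote(mirror_url)), where an empty list means every branch and tag on the primary has the same hash on the mirror.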

On FreeBSD, git can’t find the certificate store

When I was playing with git checkout of modules, I discovered that git doesn’t know how to set the certificate store for curl when it tries to retrieve a module via https. In general, I don’t recommend using git with https unless you have to. Using git+ssh avoids a bucket of authentication issues. In this case, though, https is the better choice. To tell git where to look for the certificates it needs to verify an https website, I had to add the following to my ~/.gitconfig:

[http]
    sslCAPath = /etc/ssl/certs

The command that does this is: git config --global http.sslCAPath "/etc/ssl/certs". If your operating system uses a CA file rather than a CA directory, this is the setting: git config --global http.sslCAInfo "/etc/ssl/cert.pem". You can also make this work by setting an environment variable for curl in /etc/profile.
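Choosing between the two settings can be automated. Here is a small sketch, assuming Python is available on the box; the helper names ca_config_args and apply_ca_config are hypothetical. It picks http.sslCAPath when the location is a directory and http.sslCAInfo otherwise.

```python
import os
import subprocess


def ca_config_args(ca_location):
    """Build the git config command for a CA directory or a CA bundle file."""
    # A directory of certificates uses sslCAPath; anything else is treated as a bundle file.
    key = "http.sslCAPath" if os.path.isdir(ca_location) else "http.sslCAInfo"
    return ["git", "config", "--global", key, ca_location]


def apply_ca_config(ca_location):
    """Run the command. Note that this modifies ~/.gitconfig for the current user."""
    subprocess.run(ca_config_args(ca_location), check=True)


# Show the command that would be run, without changing anything.
print(" ".join(ca_config_args("/etc/ssl/cert.pem")))
```

Building the argument list separately from running it makes it easy to inspect the command before it touches your configuration.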

Mirroring in GitLab

I normally strongly prefer git+ssh over git+https. Mirroring is an exception: if you are mirroring between two gitlab-ce instances over git+https, you can handle all of your mirroring with a single authentication token.

Pip + git for development

I’m working on what should be a simple Raspberry Pi display project, and I found that I needed a set of ad-hoc Python modules that were installable from my GitLab server. It was a bit of a journey. Here are the broad steps:

  • Create a GitLab project for your Python module. Since you will probably have a few of these, it might be good to make a group for them right now.
  • I think that you can use git+ssh://git@gitlabs.example.com... for this, but I chose to use a GitLab impersonation token since ssh isn’t installed everywhere and sometimes the installation needs a bunch of hints in ~/.ssh/config.
  • A standard install is done with pip as follows: pip install git+https://{user}:{password}@git.example.com/example-group/example-project.git. If you created a GitLab impersonation token above, you can substitute it for the password here.
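The install URL in the last step can be assembled programmatically, which helps when the token contains characters that are awkward in a URL. This is a sketch under my own assumptions: the function name pip_git_url and the sample user and token are made up for illustration.

```python
from urllib.parse import quote


def pip_git_url(user, token, host, project_path):
    """Build a git+https URL for pip with the credentials embedded (hypothetical helper)."""
    # Percent-encode the credentials so characters such as '@', ':' or '/' cannot break the URL.
    credentials = "%s:%s" % (quote(user, safe=""), quote(token, safe=""))
    return "git+https://%s@%s/%s.git" % (credentials, host, project_path)


# Made-up credentials, for illustration only.
print(pip_git_url("mirror-bot", "s3cr3t/token", "git.example.com",
                  "example-group/example-project"))
```

The resulting string is what you would pass to pip install. Keep in mind that a URL built this way contains the token in clear text, so it is better not to paste it into shared scripts or logs.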

Sometimes I need to edit the installed package that I’m working on. The way to do this is to use the --editable flag to pip. To do that, you need to add some extra information to the URL of the project being checked out. I found that this command line works:

pip install --editable git+https://{user}:{token}@gitlabs.example.com/example-group/example-project.git#egg={module_name}

I think that the #egg={module_name} piece provides pip with the name of the module as installed. I found the documentation that explains this here: “https://pip.pypa.io/en/stable/topics/vcs-support/”. Assuming that you are doing this in a venv, and it doesn’t make sense not to, you’ll get a new directory called venv/src/{module_name} which has a git checkout of your module, so you can edit it to suit this particular project.
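If you script your environment setup, the editable install can be wrapped in a few lines of Python. This is a sketch, and editable_install_command is a hypothetical name. It calls pip through the running interpreter (sys.executable -m pip), so that when the script runs inside a venv, that venv’s pip is the one used.

```python
import sys


def editable_install_command(vcs_url, module_name):
    """Build the argument list for an editable pip install from a git URL."""
    # The #egg= fragment tells pip the name of the project being checked out.
    requirement = "%s#egg=%s" % (vcs_url, module_name)
    return [sys.executable, "-m", "pip", "install", "--editable", requirement]


command = editable_install_command(
    "git+https://git.example.com/example-group/example-project.git",
    "example_module",  # placeholder module name
)
print(command[1:])
```

To actually perform the install, pass the list to subprocess.run(command, check=True).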