1443 words, ~8 min read

Hosting my Git Repositories

For a long while now it has become more and more evident that you just can't trust companies to host your data, even when you are paying for the service. The introduction of large language models has only made this more obvious.

Additionally, the products that exist in the Git hosting space are, in my opinion, less than ideal. Instead of focusing on making Git hosting as good as it possibly could be, these companies are focused on building AI platforms that integrate with all your Git repositories, which I have zero interest in when I am paying for Git hosting. Beyond that, they are built on a software development model that is clearly less than ideal, yet they have no interest in rethinking that model.

For these reasons and more I have decided it probably makes the most sense to simply host all my Git repositories myself.

The following are my notes on how I accomplished this, so that anyone else who wants to do the same can hopefully follow along.

What we need

Basically, we need a server to host the repositories and run the software. You can of course pay a VPS provider like DigitalOcean, or you can run your own server at your house or wherever you want. Note: if you run it at your house you will likely need to figure out some sort of Dynamic DNS. I believe Cloudflare provides this as a service, but I am sure there are others.

Then you need to have an operating system. I personally prefer Arch Linux so I have a server running Arch Linux.

We are going to be hosting our Git repositories over the SSH protocol with the help of a tool called Gitolite. Gitolite handles authenticating users and managing access to the Git repositories using SSH public/private key pairs.

The Setup

This portion assumes you already have an Arch Linux system set up for your server.

Get SSH Daemon Setup

The first thing we need to do on that server is install the openssh package so that we have the ssh daemon available.

sudo pacman -Sy openssh

Next we want to start the ssh daemon.

sudo systemctl start sshd

Then we want to enable it so that it starts up at boot.

sudo systemctl enable sshd

We can then verify it is running with the following.

systemctl status sshd

Then I configured sshd by editing /etc/ssh/sshd_config so that the following options are set. Explanations of what they do are in the comments above each one. This is to harden sshd, as it may be exposed directly to the internet and needs to be as secure as possible.

# Enable public key authentication
PubkeyAuthentication yes

# Tell it where to find the file containing authorized keys
AuthorizedKeysFile .ssh/authorized_keys

# Prevent password authentication
PasswordAuthentication no

# Prevent challenge response auth
ChallengeResponseAuthentication no

# Disable PAM authentication
UsePAM no

# Prevent root login
PermitRootLogin no

# Prevent keyboard interactive auth
KbdInteractiveAuthentication no

# Don't allow any empty passwords
PermitEmptyPasswords no

# Restrict allowed users
AllowUsers youruser gitolite

# Strong key exchange algorithms
KexAlgorithms curve25519-sha256,[email protected]

# Strong ciphers
Ciphers [email protected],[email protected]

# Strong MACs
MACs [email protected],[email protected]

# Only allow modern host keys
HostKeyAlgorithms ssh-ed25519

# Disable agent & TCP forwarding unless you need it
AllowAgentForwarding no
AllowTcpForwarding no
X11Forwarding no

# Disable tunneling unless needed
PermitTunnel no

# Disable rhosts-style trust
IgnoreRhosts yes
HostbasedAuthentication no

# Limit auth attempts
MaxAuthTries 3

# Limit concurrent connections
MaxSessions 2
MaxStartups 10:30:60

# Drop idle connections
ClientAliveInterval 300
ClientAliveCountMax 2
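
Before restarting, it is worth checking the edited file for syntax errors with sshd's test mode, which prints nothing when the config is valid.

sudo sshd -t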

Then I restarted sshd to have the changes take effect.

sudo systemctl restart sshd

After this I copied my public ssh key over to the server and added it to the authorized_keys file so that I could easily ssh over to the server and manage things.
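
The copy itself can be done with scp; a minimal sketch, assuming your key pair is ~/.ssh/id_ed25519, the server is reachable as your-server, and ~/.ssh already exists on the server (mkdir -m 700 ~/.ssh if not):

scp ~/.ssh/id_ed25519.pub user@your-server:~/.ssh/

Then, on the server, I appended it to the authorized_keys file as follows.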

cat ~/.ssh/id_ed25519.pub >> ~/.ssh/authorized_keys

Then I simply made sure that the file and the ~/.ssh directory had the correct permissions and ownership.

chmod 600 ~/.ssh/authorized_keys
chown -R user:user ~/.ssh

Install & Setup Gitolite

First I installed the dependencies as follows.

sudo pacman -Sy base-devel git

Then I installed Gitolite as follows.

sudo pacman -Sy gitolite

The above created the gitolite user on the system and set up /var/lib/gitolite as its home directory.

Then I simply needed to initialize Gitolite. In order to do this I ran the following.

sudo -u gitolite gitolite setup -pk path/to/my/public/ssh/key

This set up the gitolite-admin repository as well as a testing repository. The gitolite-admin repository is how you configure and manage Gitolite: you clone it, make changes to the configs in there, then commit and push. Gitolite then handles creating repositories for you with the configured access controls.
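
The day-to-day flow looks roughly like this (assuming your server is reachable as git.example.com; substitute your own host):

# Clone the admin repository over SSH as the gitolite user
git clone gitolite@git.example.com:gitolite-admin
cd gitolite-admin

# Edit conf/gitolite.conf and/or drop public keys into keydir/,
# then push; Gitolite applies the changes on receipt
git add -A
git commit -m "Add a new repository"
git push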

The config format supports private repositories as well as anonymous (unauthenticated) access, which is useful for opensource.
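
To make that concrete, here is a sketch of what conf/gitolite.conf might contain. The repo and user names here are hypothetical; users are named after their public key files in keydir/. Granting read access to the special daemon user is what tells Gitolite to mark a repository as exportable by git-daemon (more on that later):

repo gitolite-admin
    RW+ = admin

# A private repository: alice has full control, bob can only read
repo private-project
    RW+ = alice
    R   = bob

# An opensource repository: also readable anonymously over git://
repo oss-project
    RW+ = alice
    R   = daemon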

What about Off-site Backups

One side benefit I was getting from the services I was leaving was off-site backups. So I figured I might as well address that in my setup as well.

It turns out there is a pretty great backup tool called Restic that integrates with Backblaze B2, which lets you configure backups to be performed and stored in a Backblaze B2 bucket quite easily.

First I had to decide what I wanted to back up. In my case I simply wanted the following.

/var/lib/gitolite/repositories
/var/lib/gitolite/.gitolite

Then I installed Restic as follows.

sudo pacman -Sy restic

Then I went and created my Backblaze B2 bucket and an Application Key, noted down the bucket name, the key id, and the key itself, and did the following.

sudo mkdir -p /etc/restic
sudo nvim /etc/restic/gitolite.env

I then filled /etc/restic/gitolite.env with the following content.

export RESTIC_REPOSITORY="b2:BUCKET_NAME:/gitolite"
export RESTIC_PASSWORD="use-a-long-random-passphrase"

export B2_ACCOUNT_ID="your-application-key-id"
export B2_ACCOUNT_KEY="your-application-key"

Then I locked down the permissions and ownership on that file as follows.

sudo chmod 600 /etc/restic/gitolite.env
sudo chown root:root /etc/restic/gitolite.env

Then I initialized the restic repository.

sudo -i
source /etc/restic/gitolite.env

restic init

Then I ran a manual backup to test things out.

restic backup \
  /var/lib/gitolite/repositories \
  /var/lib/gitolite/.gitolite
restic snapshots

Then I added an exclusion file to exclude unnecessary files.

nvim /etc/restic/gitolite.exclude

and gave it the following content.

**/tmp
**/*.lock
**/lost+found

Then I tested a manual backup again with it.

restic backup \
  --exclude-file /etc/restic/gitolite.exclude \
  /var/lib/gitolite/repositories \
  /var/lib/gitolite/.gitolite

Then I came up with a simple retention policy and tested it as follows.

restic forget \
  --keep-daily 14 \
  --keep-weekly 8 \
  --keep-monthly 12 \
  --prune

This removes old snapshots according to the retention policy and prunes the data they alone referenced.
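
If you want to preview what a given policy would delete before actually applying it, restic forget also accepts a --dry-run flag.

restic forget --dry-run \
  --keep-daily 14 \
  --keep-weekly 8 \
  --keep-monthly 12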

Then I created a backup script so that I could automate this.

nvim /usr/local/bin/restic-gitolite-backup.sh

with the following content.

#!/usr/bin/env bash
set -euo pipefail

source /etc/restic/gitolite.env

restic backup \
  --exclude-file /etc/restic/gitolite.exclude \
  /var/lib/gitolite/repositories \
  /var/lib/gitolite/.gitolite

restic forget \
  --keep-daily 14 \
  --keep-weekly 8 \
  --keep-monthly 12 \
  --prune

Then I made it executable as follows.

chmod +x /usr/local/bin/restic-gitolite-backup.sh

Then I registered it as a systemd service by creating the following.

nvim /etc/systemd/system/restic-gitolite-backup.service

with the following content.

[Unit]
Description=Restic backup of Gitolite repositories
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/restic-gitolite-backup.sh

Then I created a systemd timer to run it automatically.

nvim /etc/systemd/system/restic-gitolite-backup.timer

with the following content.

[Unit]
Description=Daily Restic Gitolite backup

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target

Then I simply enabled it as follows.

systemctl daemon-reload
systemctl enable --now restic-gitolite-backup.timer

And verified that it was properly registered and running with the following.

systemctl list-timers
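
You can also trigger the backup service once by hand, rather than waiting for the timer, to confirm the whole unit runs end to end.

sudo systemctl start restic-gitolite-backup.service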

Finally I tested the backups to make sure I could restore from them as follows.

restic snapshots
restic restore latest --target /tmp/restore-test

In theory you should be able to restore an individual repo as follows. But I have not tested this yet.

restic restore latest \
  --include /var/lib/gitolite/repositories/myrepo.git \
  --target /tmp/restore-test
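
It is also worth running restic's built-in integrity check against the repository from time to time.

source /etc/restic/gitolite.env
restic check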

OpenSource Repositories

Next up is hosting anonymous repositories over the git:// protocol, which doesn't facilitate any authentication. I do this via git-daemon, which comes with git. It also has the benefit of playing nicely with Gitolite and therefore allows sharing the same repository store: granting read access to the special daemon user in the Gitolite config, as in the conf sketch earlier, is what marks a repository as exportable by git-daemon.

To do this I created a systemd service unit for it.

/etc/systemd/system/git-daemon.service

with the following content.

[Unit]
Description=Git Daemon (anonymous read-only)
After=network.target

[Service]
User=gitolite
Group=gitolite
ExecStart=/usr/lib/git-core/git-daemon \
  --reuseaddr \
  --base-path=/var/lib/gitolite/repositories \
  --disable=receive-pack \
  --informative-errors \
  --verbose
Restart=on-failure

[Install]
WantedBy=multi-user.target

Then I enabled and started the service as follows.

sudo systemctl daemon-reload
sudo systemctl enable git-daemon
sudo systemctl start git-daemon
sudo systemctl status git-daemon

Then I simply exposed the git-daemon port, 9418, in the firewall so that people outside the network would be able to access the opensource repositories.
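
The exact command depends on which firewall you run; as a sketch, with ufw it would look like the following (an equivalent rule applies for iptables or nftables).

sudo ufw allow 9418/tcp

With the port open, anyone can then clone an exported repository anonymously, e.g.:

git clone git://git.example.com/oss-project.git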

Dynamic DNS

I am using Cloudflare for DNS zone hosting. So I whipped up the following little script, ~/bin/update_dns_record_with_public_ip, to dynamically set the IP address.

#!/usr/bin/env bash
set -euo pipefail

ZONE_ID=
RECORD_ID=
CLOUDFLARE_API_TOKEN=

SUB_DOMAIN=

TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
LOGFILE="$HOME/logs/cloudflare-ddns.log"

# Fetch the RECORD_ID of the subdomain. This only has to be done initially to get the value for RECORD_ID
# So just uncomment this, save the file, and run the script to get the RECORD_ID. Then simply update the
# script with the RECORD_ID value, recomment these lines, and save the script and you should be good to go.
#
# curl -X GET "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records?name=$SUB_DOMAIN" \
#  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN"
# exit 0

# Fetch the public IP address using ifconfig.me
IP_ADDRESS=$(curl -s ifconfig.me)

# Get the IP address Cloudflare currently has for the subdomain
DNS_IP=$(curl -s \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -H "Content-Type: application/json" \
  "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$RECORD_ID" \
  | jq -r '.result.content')

# Compare the existing DNS record's IP addr to the current public IP addr
if [[ "$IP_ADDRESS" == "$DNS_IP" ]]; then
  echo "$TIMESTAMP - IP unchanged ($IP_ADDRESS). No update needed." >> "$LOGFILE"
  exit 0
fi

RESPONSE=$(curl -X PUT "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$RECORD_ID" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data "{
    \"type\": \"A\",
    \"name\": \"$SUB_DOMAIN\",
    \"content\": \"$IP_ADDRESS\",
    \"ttl\": 120,
    \"proxied\": false
}")

# Check success
SUCCESS=$(echo "$RESPONSE" | jq -r '.success')
if [[ "$SUCCESS" == "true" ]]; then
  echo "$TIMESTAMP - DNS update successful ($IP_ADDRESS)" >> "$LOGFILE"
else
  ERR_MSG=$(echo "$RESPONSE" | jq -r '.errors[]?.message')
  echo "$TIMESTAMP - DNS update failed: $ERR_MSG" >> "$LOGFILE"
fi

This script expects your user's home directory to also contain a logs directory (mkdir -p ~/logs) for it to write its logs to.

This script fetches the public IP address using ifconfig.me and then uses the Cloudflare API to update the subdomain.

In order for this to work you need to create an API token that is set up with permission to edit DNS records for the zone. This can be done from Cloudflare -> Avatar -> My Profile -> API Tokens.

Then to use this you obviously need to fill in the variables at the top of the script. The ZONE_ID can be obtained by selecting the zone in the Cloudflare UI and looking in the right sidebar for the API section, where the zone id value is shown. The RECORD_ID, I believe, is only available by using the API call that is commented out near the top of the script.
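
If you'd rather not eyeball the JSON for that, you can pipe the same API call through jq (which the script already depends on) to pull out just the record id; something like:

curl -s "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records?name=$SUB_DOMAIN" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | jq -r '.result[].id'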

Once I had the script set up I was able to test it by just running it. But I don't want to do this manually; I want it to be run automatically.

So I installed cronie with the following.

sudo pacman -Sy cronie vi
mkdir -p ~/.cache
sudo systemctl enable cronie.service
sudo systemctl start cronie.service

Then I edited my user's crontab via crontab -e and gave it the following content.

*/5 * * * * /home/YOUR-USERNAME/bin/update_dns_record_with_public_ip >> /home/YOUR-USERNAME/logs/cloudflare-ddns-cron.log 2>&1
0 0 * * * truncate -s 0 /home/YOUR-USERNAME/logs/cloudflare-ddns.log

The above registers two cron jobs. The first one runs the script every 5 minutes to make sure that the DNS record is up to date. The second one runs once a day and truncates the log file so it does not get out of hand.

I then waited and checked the log files to make sure everything was working as expected.

Also, don't forget that you will likely need to modify your router configuration to forward port 22 (and port 9418 for git-daemon) from the outside of your network to your server's internal IP address; that is, if you aren't using a hosted VPS. If you are using a hosted VPS you shouldn't need this entire Dynamic DNS section at all, as you generally get a static IP address.

What about PRs, comments, etc.

Personally, I don't want or need any of those things. There are plenty of tools that can facilitate those types of interactions without the Git host needing to provide them. A very basic one, for open source repositories at least, is the patches-over-email workflow.
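
For instance, git itself can produce and mail patches. A sketch of what a contributor might do (the address here is hypothetical, and git send-email needs some one-time SMTP configuration first):

# Turn the commits on the current branch (ahead of origin/main) into patch files
git format-patch origin/main

# Mail them to the maintainer
git send-email --to=maintainer@example.com *.patch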