Open Source – tildeChris

August 24, 2025 (updated September 9, 2025)

ZFS on Virtual Machines

I’ve grown to love ZFS, but using it on virtual machines taught me a new trick. Virtual machines let you allocate disk space loosely, which can be very helpful, but managing that space doesn’t always play nicely with ZFS.

Virtual Machines and Thin Provisioning

Virtual machine products usually offer a feature that the VMware family of products calls “Thin Provisioning”. When you provision a “disk drive” for your VM, you can designate the size of the drive without actually claiming any space for it. So, you can thin provision a 120GB drive and only pull a few kilobytes from the filesystem for the hypervisor to manage things. This works because the drive returns zeros when you read a sector that hasn’t previously been written. This is all built on a Unix feature called sparse files. In Unix, a sparse file behaves exactly this way: if you read a block in the file that has never been written, you get zeros without the filesystem having to store anything. When you write that block, it is actually saved to the disk, and the next time you read it you get back what you wrote. In VMware this is handy because your VM’s disk drive only stores the data that you’ve written.

Day to day use

As you use this thinly provisioned drive you’ll write data, the hypervisor will store it, and all will work great. What’s interesting is what happens when you delete a file. Deleting a file won’t zero the sectors that it used, and that wouldn’t matter anyhow, because a sparse file in Unix differentiates between a block of zeros in an unwritten portion of the file and a block that you filled with zeros through a write operation. The second kind of block actually takes up space. It’s important to note that, as far as the hypervisor is concerned, a block of zeros in a thin provisioned drive never needs to take up space.
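The difference between a hole and a block of written zeros can be seen with ordinary tools. The sketch below is an illustration only: the function names and the file names hole.img and zeros.img are made up for this example, and it assumes a filesystem without transparent compression (on ZFS with compression on, both files may show almost no allocation).

```shell
#!/bin/sh
# Compare a file that is one big hole with a file of explicitly written zeros.
# Both report the same length, but only the second should consume disk blocks
# on a filesystem that does not compress data transparently.

# Print "<apparent bytes> <allocated KiB>" for the file named in $1.
file_usage() {
    bytes=$(wc -c < "$1" | tr -d ' ')
    kib=$(du -k "$1" | cut -f1)
    echo "$bytes $kib"
}

# Create hole.img (never written) and zeros.img (written zeros), 4 MiB each,
# inside the directory named in $1. Both names are hypothetical.
make_demo_files() {
    # Seeking past the end without writing anything leaves a hole.
    dd if=/dev/zero of="$1/hole.img" bs=1048576 seek=4 count=0 2>/dev/null
    # Writing zeros through a normal write allocates real blocks.
    dd if=/dev/zero of="$1/zeros.img" bs=1048576 count=4 2>/dev/null
    sync
}
```

Running make_demo_files /tmp/demo and then file_usage on each file should show the same byte count for both, with the allocated size of hole.img at or near zero.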

Reclaiming space

After a period of use, a normal filesystem will have a bunch of blocks that are written and filled with contents that used to be part of files that have since been deleted. If the disk drive that we’re talking about is a sparse file on a VMware server, this file will take up more and more space until you actually “fill up the drive”. A standard method for reclaiming this space on older filesystems is to overwrite the entire unallocated part of the drive with zeros and then ask VMware to re-examine the virtual drive, punching holes wherever a large run of zeros is found.

Where ZFS fits in

Older filesystems like UFS and EXT4 generally don’t support advanced operations like automatic compression and deduplication of filesystem objects. On these systems, writing a block of zeros to a file will generally write a block of zeros to the disk drive. On ZFS, where compression and deduplication are available, writing a block of zeros need not write any data at all. In effect, with those features turned on, you can’t write a block of zeros to a file on a host using ZFS.

Reclaiming space normally

To reclaim space from EXT4 or UFS the general procedure is simple. Within the running VM:

## Shut the system down to single user mode

# for fs in <mounted-filesystems>; do
>     # dd stops with "No space left on device" once the filesystem is full; that is expected.
>     dd if=/dev/zero of="${fs}/.zerofill" bs=26214400
>     sync
>     rm "${fs}/.zerofill"
> done

## Now all the unallocated space should have been zeroed out
## Shut down the VM

Now on the host, run whatever tool you need to reclaim space from the VM. This usually involves having the hypervisor or a utility read the virtual disk block by block, discarding any blocks that are completely filled with zeros. In VMware Fusion, this is the utility:

# vmware-vdiskmanager -k <virtual-disk-file>

It’s usually that simple. When the filesystem is ZFS, things are different. For one thing, several mount points will exist as part of a single ZFS pool, so you only need to fill one “filesystem” on a pool to zero out all of its unallocated space. More importantly, with compression and dedup on, you won’t be able to write zeros to the virtual drive. Here’s the procedure on a machine with ZFS:

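## Again, shut the system down to single user mode.

# for p in <affected-zfs-pools>; do
>    # Turn off compression and dedup so the zeros are actually written.
>    zfs set compression=off dedup=off "${p}"
>    dd if=/dev/zero of=<pool-mountpoint>/.zerofill bs=26214400
>    sync
>    rm <pool-mountpoint>/.zerofill
>    # Restore the previous settings; this assumes both were on before.
>    zfs set compression=on dedup=on "${p}"
> done
# shutdown -h now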

At this point you should be able to reclaim the zeroed space using the hypervisor’s tools as before.

July 9, 2025 (updated August 24, 2025)

Frontier Fiber / Static IP

After many years I decided to pull the trigger on getting a static IPv4 address. The biggest factor in the decision was my wife’s LLC. For Frontier Fiber here are the steps:

Like many other ISPs, Frontier only gives out static IPs on “business” grade connections. Without being negative, the first question you get when upgrading from a residential to a business connection is: “What’s the difference?” or “Is there a difference between a residential account and a business account?”. There may be a difference in service level but I can’t see it enumerated anywhere in the service agreements. For me the change was accomplished without changing any equipment. The important thing is that Frontier, like Optimum and possibly Xfinity, will only give out a static IP on a business account, so you have to “upgrade”.
That was the first part of the process for me. I had to work with Frontier to convert my account from residential to business. From my perspective this looks like moving my file from one file cabinet to another. Further, when I called to manage the process, I always got a customer service rep who handled residential accounts, so I had to wait to be connected to a rep who worked the “business” side of the house.
Once the work order was finished, I was able to set up an appointment to get my ONT reconfigured from residential/DHCP to business/static. The tech showed up this morning. Frontier truly does static IP. As I understand things, they do not use a BOOTP style process where your equipment learns a static IP assignment via DHCP. I wish they did that but they don’t. If I’m honest with myself, I have several servers that get their IP address via DHCP, where the address they get is statically tied to a particular MAC address, BOOTP style. Honestly, I find that there are places on my network where this whole thing is brittle, and I control the whole thing. It’s probably for the best that they force me to configure things statically.

That’s pretty much it. In my case, I had some cleanup work that had to be done on the account before conversion from residential to business and that delayed the process for me by a few days. But, so long as Frontier stays up, I expect to have the same IP address until I change ISPs. The Frontier techs were tremendously professional during this whole process; they deserve praise for the way this was handled.

I’ll write a future article about how I do ISP failover on OpenBSD. This change modifies that process.

July 2, 2025

Email Deliver-ability

Way back in 1996, I remember attending a Birds of a Feather session on email and spam at the USENIX technical conference. The people in the room railed at the spam problem and it was clear that the leaders were taking the spam as a personal attack. I sat quietly in the room, silently noting to myself that none of the proposed solutions, not even adding extensions to the SMTP protocol, were going to stop the growing commercialization of email as a medium. This is because any magic dust that you can sprinkle on email to mark it as trustworthy and not spam can and will be ruthlessly adopted by commercial senders to increase their own deliver-ability.

Increasing deliver-ability

I just added DKIM signing to messages that come from vindaloo.com. I did this because I added a new domain to my mail server so I could support my wife’s LLC: moderncrc.com. Honestly, I might have been better off outsourcing this to Purely Mail and if you are here trying to figure out how to set up mail for your own domain, I say that for 90% of people, outsourcing to someone like Purely Mail is the right way to go.

For self hosters and smaller companies considering hosting their own email: deliver-ability will be your biggest problem. Getting other people to accept mail from you, and not automatically treat it as spam to be quarantined rather than read, is the biggest hurdle you will have to get over. On the modern internet, achieving deliver-ability means jumping through a few hoops.

You need to get an IPv4 address that hasn’t been fouled by someone using it to send spam. When these addresses get fouled, they get enumerated onto lists called RBLs, or real-time blackhole lists. These are DNS based lists that say: this IP address could be a source of spam. Getting a clean address isn’t generally difficult, but it means that you won’t ever be able to send SMTP mail from an end-user internet connection such as an Xfinity or FiOS internet account. And to be clear, I mean across cable and fiber, business and residential. The best way past this hurdle is to set up your outgoing SMTP server on a VPS from someone like vultr.com. After this you’ll probably need to put in a support request to be allowed to send mail at all. Of course, this pretty much means that you need to know how to run a Linux server with all that that entails.
You’ll need to set up DNS for your domain: at least SPF and DMARC, but probably also DKIM. Microsoft, Google, and Yahoo all require DMARC and either SPF or DKIM to deliver your messages. SPF is simple. You just enumerate the IP addresses that you allow to send email from your domain. DKIM is a little harder. You set up a private-key, public-key pair; then for each message that you send, your email server extracts a portion of it and signs that portion using your private key. You publish the public key in your DNS. People receiving email from you can verify this signature and, if it all works, they know that you are the actual sender of the email rather than a spammer.
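All of these checks come down to DNS lookups against well-known names. As a rough sketch, the helpers below build those names. The function names are invented for this example, dnsbl.example.org stands in for whichever blocklist zone you care about, and the DKIM selector is whatever your mail server was configured with.

```shell
#!/bin/sh
# Build the DNS names that receivers look up when judging a sender.

# DNSBL lookups reverse the IPv4 octets and append the list's zone.
# Usage: dnsbl_query_name 192.0.2.10 dnsbl.example.org
dnsbl_query_name() {
    echo "$1" | awk -F. -v zone="$2" '{ print $4 "." $3 "." $2 "." $1 "." zone }'
}

# The DMARC policy is a TXT record at _dmarc.<domain>.
dmarc_record_name() {
    echo "_dmarc.$1"
}

# A DKIM public key is a TXT record at <selector>._domainkey.<domain>.
# Usage: dkim_record_name <selector> <domain>
dkim_record_name() {
    echo "$1._domainkey.$2"
}

# With network access the records could then be inspected with dig, e.g.:
#   dig +short TXT example.com                           # SPF lives here
#   dig +short TXT "$(dmarc_record_name example.com)"
#   dig +short A   "$(dnsbl_query_name 192.0.2.10 dnsbl.example.org)"
```

An answer to the DNSBL query means the address is listed; no answer means it is not on that particular list.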

Where we ended up

All of this generally works but my frustration stems from the fact that it does very little to reduce spam. For years, over 80% of the spam that I receive has had valid SPF and DKIM and I’m writing this today because yet another obvious phishing attempt was sent to me. Of course, it passed SPF and DKIM with flying colors.

Thus we end up in a world of unintended consequences. Rather than the internet as envisioned, a large group of equally participating networks, we are slowly moving to a world where only Microsoft, Google, and Yahoo can deliver email.

October 7, 2023 (updated September 12, 2025)

Git mirroring

Not many people choose to run their own GitLab instance these days. My preference for self reliance means that I do. If you value self reliance, I have recommendations:

Use ansible, chef, or puppet to build your GitLab instance because you are going to build two.
Build one GitLab server for your group’s consumption. Put this one in a data center close to your users for good performance.
Build a second GitLab server in a remote location, perhaps at your favorite cloud provider. Wherever the second GitLab instance is, you’ll want either one-way or bidirectional access via https or ssh between the two servers.
Follow the directions in your GitLab under Help -> User Documentation -> Mirror a Repository to mirror the repository from the primary to the secondary.

At this point, you’ve created a great plan B for disaster recovery in case something terrible happens to your GitLab. For me, GitLab stores the Terraform and Ansible that I use to build my infrastructure. The goal is to be able to jump-start whatever you run from the mirror. I call the mirror my plan B because my plan A is to directly restore GitLab from a nightly backup.

Setting up the mirror

Setting up the mirror is well documented. In broad strokes, here are the steps:

On the mirror, create a group and project to hold the mirrored repository.
Choose Push or Pull mirroring. In Push, the primary will push updates to the secondary as you work. In Pull, the mirror will periodically poll the primary for changes. You’ll have to decide what works best for you.
Fill out the form and perform any needed setup. When using push over ssh, this means setting up the primary to push and then copying the ssh public key from the primary and adding it as an allowed key on the mirror user.

As you configure mirroring, remember that constructing the mirror URL can be tricky, especially if you want to use ssh as the transport. A typical git cloning string looks like this: git@git.example.com:group/project.git but the mirroring URL for the same repository is: ssh://git@git.example.com/group/project.git. The difference lies between the server, git.example.com, and the path. When cloning, the separator is a colon, ‘:’. When mirroring, it’s a slash, ‘/’.

Getting the authentication right can also be tricky. Mirroring more than a few repositories over SSH becomes awkward because GitLab generates a new ssh key for each repository. My concern with git+ssh here is that the proliferation of keys may eventually cause git to fail with a “too many authentication attempts” error. This is one of the few places where I like git+https more than git+ssh: you can create a single user token for all of your mirroring.

Finally, git+https is not without its pitfalls. If, like me, you also have your own CA, then you have the additional problem that git doesn’t do a good job of configuring curl’s CA. You have two choices here. On the box initiating the transfer, run: git config --global http.sslCAPath my-ca-path to use your CA directory, or git config --global http.sslCAInfo my-ca-file.pem to use your CA file.
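The colon-versus-slash rewrite is mechanical, so it can be scripted when there are many repositories to mirror. This is a minimal sketch; to_ssh_url is a name made up for this example and it only handles the simple user@host:path form shown above.

```shell
#!/bin/sh
# Convert an scp-style git address (user@host:path) into an ssh:// URL.
# Addresses that already contain "://" are passed through unchanged.
to_ssh_url() {
    case "$1" in
        *://*) echo "$1" ;;
        # Everything before the first colon is user@host; the rest is the path.
        *:*)   echo "ssh://${1%%:*}/${1#*:}" ;;
        *)     echo "$1" ;;
    esac
}

# Example:
#   to_ssh_url git@git.example.com:group/project.git
```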

Once you’ve set up mirroring, you have a great plan B if the day ever comes that your GitLab server becomes unavailable. You should have a working GitLab mirror that you can use in any way that you please. You can even pull backups of the mirror server so you have a redundant, offsite backup.

Self hosted PKI Notes

If you run your own PKI you must install your PKI root certificates into the directory /etc/gitlab/trusted-certs and then run gitlab-ctl reconfigure. This is described in the GitLab documentation. Failing to do this will result in mirroring errors because the GitLab server cannot establish a trusted TLS connection to a mirror whose certificate was issued by your private PKI.

September 13, 2023

When an ansible task fails

It’s been a frustrating week. If it can break, it has broken, and lately I’ve been shining up my ansible to fix it. So I find myself trying to use my shiny new playbooks to address problems and to get my machines to all line up. Today my ansible-playbook ... run hung up on an ARM-based mini-NAS that I have in my vacation house. My first assumption was that ansible was the problem. That was wrong. To find the problem, I ran the playbook and then logged onto the machine separately. A quick ps alx gave me this little snippet:

1001 43918 43917  2  52  0  12832  2076 pause    Is    1       0:00.03 -ksh (ksh)
   0 43943 43918  3  24  0  18200  6916 select   I     1       0:00.04 sudo su -
   0 43946 43943  2  26  0  13516  2776 wait     I     1       0:00.02 su -
   0 43947 43946  2  20  0  12832  2024 pause    S     1       0:00.03 -su (ksh)
   0 51594 43947  3  20  0  13464  2572 -        R+    1       0:00.01 ps alx
   0 51578 51527  2  52  0  12832  1980 pause    Is+   0       0:00.01 ksh -c /bin/sh -c '/usr/local/bin/python3.9 /root/.ansible/tmp/ansible-tmp-1694615369.904476-9336-34642038817669/Ansib
   0 51579 51578  3  52  0  13536  2552 wait     I+    0       0:00.01 /bin/sh -c /usr/local/bin/python3.9 /root/.ansible/tmp/ansible-tmp-1694615369.904476-9336-34642038817669/AnsiballZ_pkg
   0 51580 51579  3  40  0  36756 23668 select   I+    0       0:01.51 /usr/local/bin/python3.9 /root/.ansible/tmp/ansible-tmp-1694615369.904476-9336-34642038817669/AnsiballZ_pkgng.py
   0 51582 51580  0  52  0  21388  9048 wait     I+    0       0:00.04 /usr/sbin/pkg update
   0 51583 51582  1  52  0  21708 10104 ttyin    I+    0       0:00.19 /usr/sbin/pkg update

This is relevant because it traces the process tree from my ssh login all the way down to the process that’s hung up. Note well that the pkg update run at PID 51583 is in a ttyin state: it is waiting for input from a terminal. Running pkg update manually gave me this:

# pkg update
Updating FreeBSD repository catalogue...
Fetching packagesite.pkg: 100%    6 MiB   3.3MB/s    00:02
Processing entries:   0%
Newer FreeBSD version for package zziplib:
To ignore this error set IGNORE_OSVERSION=yes
- package: 1302001
- running kernel: 1301000
Ignore the mismatch and continue? [y/N]:

The why of all this doesn’t really matter much. In this case the machine is running a stale copy of FreeBSD, 13.1, and pkgng is asking my permission to update from a package repository built for FreeBSD 13.2. What’s important here is a basic debugging technique. The important question is: how does ansible actually work under the covers? The answer is that each ansible module prepares a 100k or so blob of python that it drops into …/.ansible/tmp on the remote machine. Then it uses the python interpreter on that machine to run the blob. The python within the blob idempotently does the work. My blob needed to verify that the sudo package was installed on my box. For reasons that I don’t understand but also really don’t mind, it wanted to make sure that the local package catalogue was up to date. It’s not normal for a box to hang on pkg update but it’s not crazy either.
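Spotting the stuck process by eye works, but the same check can be scripted. The sketch below assumes the FreeBSD ps alx column layout shown in the listing above, where the PID is the second column and the wait channel is the ninth. The function name find_tty_waiters is made up for this example.

```shell
#!/bin/sh
# Print the PID and command of every process blocked waiting for terminal
# input. Reads FreeBSD-style `ps alx` output on stdin:
#   UID PID PPID C PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
find_tty_waiters() {
    awk '$9 == "ttyin" {
        # Rebuild the command from field 13 to the end of the line.
        cmd = $13
        for (i = 14; i <= NF; i++) cmd = cmd " " $i
        print $2, cmd
    }'
}

# Example:
#   ps alx | find_tty_waiters
```

A process launched by ansible that shows up here is waiting on a prompt nobody will ever answer. In this case the prompt itself names one way out, setting IGNORE_OSVERSION=yes, though upgrading the stale machine is the cleaner fix.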

September 1, 2023

Python Modules: setup.py

I found the setup.py file for the pyvmomi module here: https://github.com/vmware/pyvmomi/blob/master/setup.py. It looks like a reasonable example of what to do in setup.py.

June 21, 2023 (updated October 7, 2023)

On FreeBSD, git can’t find the certificate store

When I was playing with git checkout of modules I discovered that git doesn’t know how to set the certificate store for curl when it tries to retrieve a module via https. In general, I don’t recommend using git with https unless you have to. Using git+ssh avoids a bucket of authentication issues. In this case, https is the better choice. To tell git where to look for the certificates it needs to verify an https website, I had to add the following to my ~/.gitconfig:

[http]
    sslCAPath = /etc/ssl/certs

The command that does this is: git config --global http.sslCAPath "/etc/ssl/certs". If your operating system uses a CA file rather than a CA directory this is the setting: git config --global http.sslCAInfo "/etc/ssl/cert.pem". You can also make this work by setting an environment variable for curl in /etc/profile.

Mirroring in Gitlab

I normally strongly prefer git+ssh over git+https. If you are mirroring between two gitlab-ce instances over git+https, you can handle your mirroring with a single authentication token.

June 21, 2023 (updated June 23, 2023)

Pip + git for development

I’m working on what should be a simple raspberry pi display project and I came up with the need for a set of ad-hoc python modules that were installable from my gitlab server. It was a bit of a journey. Here are the broad steps:

Create a GitLab project for your python module. Since you will probably have a few of these, it might be good to make a group for them right now.
I think that you can use git+ssh://git@gitlabs.example.com... for this, but I chose to use a GitLab impersonation token since ssh isn’t installed everywhere and sometimes the installation needs a bunch of hints in ~/.ssh/config.
A standard install is done with pip as follows: pip install git+https://{user}:{password}@gitlabs.example.com/example-group/example-project.git. If you created a GitLab impersonation token above, you can substitute it for password here.

Sometimes I need to edit the installed package that I’m working on. The way to do this is to use the --editable flag to pip. To do that you need to give pip some extra information about the project being checked out. I found that this command line works:

pip install --editable git+https://{user}:{token}@gitlabs.example.com/example-group/example-project.git#egg={module_name}

I think that the #egg={module_name} piece provides pip with the name of the module as installed. I found the documentation that explains this here: https://pip.pypa.io/en/stable/topics/vcs-support/. Assuming that you are doing this in a venv, and it doesn’t make sense not to, you’ll get a new directory called venv/src/{module_name} which holds a git checkout of your module, so you can edit it to suit this particular project.
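Since the requirement string has several moving parts, it can help to assemble it in one place and keep the token out of the script itself. This is only a sketch: pip_git_requirement and the GITLAB_USER and GITLAB_TOKEN environment variables are names invented for this example.

```shell
#!/bin/sh
# Build a pip VCS requirement for a private GitLab project.
# Arguments: user, token, host, project path (group/project), module name.
pip_git_requirement() {
    echo "git+https://$1:$2@$3/$4.git#egg=$5"
}

# Example, reading the credentials from the environment (hypothetical names):
#   pip install --editable "$(pip_git_requirement "$GITLAB_USER" "$GITLAB_TOKEN" \
#       gitlabs.example.com example-group/example-project example_module)"
```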

October 15, 2021

Nuke and Pave

I recently reinstalled MacOS on my work and home laptops and then brought back my working state using Time Machine on both. I’m always impressed by how much faster and better a computer is after you do this. My friend Matt Zagaja: https://zagaja.com calls this a “Nuke and Pave” from here: https://www.macsparky.com/blog/2016/3/t0kcqkdxmkapwyo9eno0hv98ojd2kx and I love the term. In my opinion, one of the bad side effects of MacOS’ success is that you don’t have to *Nuke and Pave* very often. I think I’d been carrying my working environment forward for better than 10 years without a refresh and moving from High Sierra to Catalina added a bunch of unwanted quirkiness. This was probably because Apple is deprecating a bunch of the tools that I used in 2012 and while I don’t use them today, they were still installing kernel extensions and other stuff that was making my machine a little unstable. If you want to do your own *Nuke and Pave* on mac, you’ll need the following:

The operating system you want to install. I used Big Sur 11.6. I find that for MacOS you want to download the OS and then use instructions like these: https://support.apple.com/en-us/HT201372 to create usb install media.
If you use MacPorts, see the notes at the end to save a list of the ports that you run. You’ll need it when you rebuild.
Backup media: If it’s important, you should have one or two backups of it. In this case you want a Time Machine backup. Disk Clone style backups would normally be quicker but don’t give you the granularity you need here. I use a USB-C to NVMe drive enclosure for speed here. My second backup is on rotating rust.

The operation is pretty simple. You want to:

Boot your Mac from the USB installer by shutting down completely and then booting while pressing and holding *Option* until your Mac presents you with a choice of boot media. It’s handy that newer Macs will boot on a keypress so you can start this process by simply pressing and holding *Option*. If you are on Catalina or later you have to boot to _recovery mode_ first by shutting down your mac completely, and then use the utilities menu to enable booting from other media. If you have a firmware password on your Mac, you’ll need that to change this setting.
Once you’ve booted from your install media, you need to erase and repartition the hard drive on your Mac. This is the point of no return so don’t take this step unless you trust your backups.
Follow the install media instructions to reinstall MacOS on your computer. It will pause and ask you how to build users. What’s going on behind the scenes is the mac is using Migration Assistant to populate your home directory. Choose Time Machine backup and go into the menus and trim all of Applications, Settings, etc. You really only want to carry over data at this point. If you don’t migrate enough information, you can use Migration Assistant or Time Machine to catch anything that you missed.
Reinstall your apps using the App Store, and whatever other sources you have. As a developer I have a bunch of software installed that requires me to Control-Click on the Application and then give permission to run one time.
Restore security permissions as needed. App Store packages generally won’t have this problem. Other packages will. I use Emacs as my main editor because I’ve been doing this for a while. That requires me to go into the System Preferences -> Security & Privacy -> Privacy pane and grant Emacs permission to read files from my specified locations.

That’s most of what you need. I did the operation overnight. I handled steps 1 ~ 3 and then went to sleep. When I woke up I finished up 4 and 5.

A side note here for MacPorts or Homebrew users: you’ll want to restore your MacPorts/Homebrew environment also. For MacPorts this isn’t hard. Basically run sudo port list requested > ~/Desktop/ports-requested.txt to leave a copy of the ports you installed by hand in a text file. When you are rebuilding your machine, you’ll need to perform the prerequisites needed to run MacPorts. Then you can use this output to install the packages that you used. I don’t use Homebrew but I’d imagine that there must be something similar to this in Homebrew.
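Turning the saved list back into installed ports takes one small step, because each line of the file carries more than the port name. The sketch below assumes the usual port list layout of name, @version, and category/name per line. requested_port_names is a name made up for this example.

```shell
#!/bin/sh
# Reduce saved `port list` output to a unique, sorted list of port names.
# Assumes each non-empty line looks like: <name> @<version> <category>/<name>
requested_port_names() {
    awk 'NF > 0 { print $1 }' | sort -u
}

# Example, on the rebuilt machine once MacPorts itself is installed:
#   requested_port_names < ~/Desktop/ports-requested.txt | xargs sudo port install
```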

March 2, 2021 (updated April 27, 2021)

Git Sparse Checkout

At work we had a very large monorepo. I’m tempted to quote Douglas Adams here but the reference is good enough. Checking out the whole thing risks confusing git status messages caused by changes in other parts of the tree. These messages are a distraction, and dealing with them can consume large amounts of time. The best way to avoid them is to perform what’s called a sparse checkout. This is a checkout that only puts what you need into your working directory. In a normal checkout:

 $ git clone ...

You get the entire code base in your working directory. A sparse checkout is more complicated to perform:

 $ mkdir <target-directory>
 $ cd <target-directory>
 $ git init .
 $ git config core.sparseCheckout true
 $ echo "<your-desired-subdir>" >> .git/info/sparse-checkout
 $ ## Repeat the echo for each directory you need.
 $ git remote add origin https://git.neopost.com/PPT/IBMHSM.git
 $ git fetch
 $ git checkout master

It’s eight steps but if you do it this way, you gain complete control over what’s in your working directory.
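Newer versions of git (2.25 and later) also ship a sparse-checkout command that wraps most of these steps. The sketch below is an alternative to the manual procedure, not what I ran at the time. sparse_clone is a name made up for this example.

```shell
#!/bin/sh
# Sparse clone using git's built-in sparse-checkout command (git 2.25+).
# Arguments: repository URL or path, target directory, then the
# directories to keep in the working tree.
sparse_clone() {
    repo=$1
    target=$2
    shift 2
    # --sparse starts with only the top-level files checked out.
    git clone --quiet --sparse "$repo" "$target" || return 1
    # Limit the working directory to the requested subdirectories.
    git -C "$target" sparse-checkout set "$@"
}

# Example:
#   sparse_clone <repository-url> <target-directory> <your-desired-subdir>
```

Running git sparse-checkout set again later with a different list of directories changes what is in the working directory without a fresh clone.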