When an ansible task fails

It’s been a frustrating week. If it can break, it has broken and lately I’ve been shining up my ansible to fix it. So I find myself trying to use my shiny new playbooks to address problems and to get my machines to all line up. Today my ansible-playbook ... run hung up on an arm based mini-nas that I have in my vacation house. My first assumption was that ansible was the problem That was wrong. To find the problem, I ran the playbook and then logged onto the machine seperately. A quick ps alx gave me this little snippet:

1001 43918 43917  2  52  0  12832  2076 pause    Is    1       0:00.03 -ksh (ksh)
   0 43943 43918  3  24  0  18200  6916 select   I     1       0:00.04 sudo su -
   0 43946 43943  2  26  0  13516  2776 wait     I     1       0:00.02 su -
   0 43947 43946  2  20  0  12832  2024 pause    S     1       0:00.03 -su (ksh)
   0 51594 43947  3  20  0  13464  2572 -        R+    1       0:00.01 ps alx
   0 51578 51527  2  52  0  12832  1980 pause    Is+   0       0:00.01 ksh -c /bin/sh -c '/usr/local/bin/python3.9 /root/.ansible/tmp/ansible-tmp-1694615369.904476-9336-34642038817669/Ansib
   0 51579 51578  3  52  0  13536  2552 wait     I+    0       0:00.01 /bin/sh -c /usr/local/bin/python3.9 /root/.ansible/tmp/ansible-tmp-1694615369.904476-9336-34642038817669/AnsiballZ_pkg
   0 51580 51579  3  40  0  36756 23668 select   I+    0       0:01.51 /usr/local/bin/python3.9 /root/.ansible/tmp/ansible-tmp-1694615369.904476-9336-34642038817669/AnsiballZ_pkgng.py
   0 51582 51580  0  52  0  21388  9048 wait     I+    0       0:00.04 /usr/sbin/pkg update
   0 51583 51582  1  52  0  21708 10104 ttyin    I+    0       0:00.19 /usr/sbin/pkg update

This is relevant because because it traces the process tree from my ssh login all the down to the process that’s hung up. Note well that the pkg update run at PID 51583 is in a ttyin state. Running pkg update manually gave me this:

# pkg update
Updating FreeBSD repository catalogue...
Fetching packagesite.pkg: 100%    6 MiB   3.3MB/s    00:02
Processing entries:   0%
Newer FreeBSD version for package zziplib:
To ignore this error set IGNORE_OSVERSION=yes
- package: 1302001
- running kernel: 1301000
Ignore the mismatch and continue? [y/N]: 

The why of all this doesn’t really matter much. In this case the machine is running a copy of FreeBSD that’s stale, 13.1, and pkgng is asking my permission to update to a package repository from FreeBSD 13.2. What’s important here is a basic debugging technique. The important question is: How does ansible actually work under the covers? The answer is, each ansible builtin prepares a 100k or so blob of python that it spits in …/.ansible/tmp on the remote machine. Then it uses the local python interpreter to run that blob. The python within the blob idempotently does the work. My blob needed to verify that the sudo package on my box. For reasons that I don’t understand but also really don’t mind, it wanted to make sure that the local package collection was up to date. It’s not normal for a box to hang on pkg update but it’s not crazy either.

On FreeBSD, git can’t find the certificate store

When I was playing with git checkout of modules I discovered that git doesn’t know how to set the certificate store for curl when it tries to retrieves a module via https. In general, I don’t recommend using git with https unless you have to. Using git+ssh obviates away a bucket of authentication issues. In this case, https is the better choice. To tell git where to look for certificates, to verify and https website, I had to add the following to my ~/.gitconfig:

[http]
sslCApath=/etc/ssl/certs

Pip + git for development

I’m working on what should be a simple raspberry pi display project and I came up with the need for a set of ad-hoc python modules that were installable from my gitlab server. It was a bit of a journey. Here are the broad steps:

  • Create a gitlabs project for your python module. Since will probably have a few of these, it might be good to make a group for them right now.
  • I think that you can use git+ssh://git@gitlabs.example.com... for this but I chose to use a gitlabs impersonation token for this since ssh isn’t installed everywhere and sometimes the installation needs a bunch of hints in ~/.ssh/config.
  • A standard install will be done with pip as follows pip install git+https://{user}:{password}@git.example.com/example-group/example-project.git. If you created a gitlabs impersonation token about, you can substitute it for password here.

Sometimes I need to edit the installed package that I’m working on. The way to do this is to use the –editable flag to pip. To do that you need to specify some extra information to git when checking out the project. I found that this command line:

pip install --editable git+https://{user}:{token}@gitlabs.example.com/example-group/example-project.git#egg={module_name}

I think that the #egg={module_name} piece provides pip with name of the module as installed. I found the documentation that explains this here: “https://pip.pypa.io/en/stable/topics/vcs-support/”. Assuming that you are doing this in a venv, and it doesn’t make sense not to, you’ll get a new directory called venv/src/{module_name} which has a git checkout of your module so you can edit it to your needs for this particular project.

Nuke and Pave

I recently reinstalled MacOS on my work and home laptops and then brough back my working state using Time Machine on both. I’m always impressed by how much faster and better a computer is after you do this. My friend Matt Zagaja: https://zagaja.com calls this a “Nuke and Pave” from here: https://www.macsparky.com/blog/2016/3/t0kcqkdxmkapwyo9eno0hv98ojd2kx and I love the term. In my opinion, one of the bad side effects of MacOS’ success is that you don’t have to *Nuke and Pave* very often. I think I’d been carrying my working environment forward for better than 10 years without a refresh and moving from High Sierra to Catalina added a bunch of unwanted quirkiness. This was probably because Apple is deprecating a bunch of the tools that I used in 2012 and while I don’t use them today, they were still installing kernel extensions and other stuff that was making my machine a little unstable. If you want to do your own *Nuke and Pave* on mac, you’ll need the following:

  1. The operating system you want to install. I used Big Sur 11.6. I find that for MacOS you want to download the OS and then use instructions like these: https://support.apple.com/en-us/HT201372 to create usb install media.
  2. If you use MacPorts see the notes at the end to save a list of the ports that your run. You’ll need it when you rebuild.
  3. Backup media: If it’s important you should have one or two backups of it . In this case you want a Time Machine backup. Disk Clone style backups would normally be quicker but don’t give you the granularity you need here. I use a USB-C to NVMe drive enclosure for speed here. My second backup is on rotating rust.

The operation is pretty simple. You want to:

  1. Boot your Mac from the USB installer by shutting down completely and then booting and pressing *Option* and holding it until your Mac presents you with a choice of boot media. It’s handy that newer Macs will boot on a keypress so you can start this process by simply pressing and holding *Option* If you are on Catalina or later you have to boot to _recovery mode_ first by shutting down your mac completely and using the utilities menu to enable booting from other media. If you have a firmware password on your Mac, you’ll need that to change this setting.
  2. Once you’ve booted from your install media, you need to erase and repartition the hard drive on your Mac. This is the point of no return so don’t take this step unless you trust your backups.
  3. Follow the install media instructions to reinstall MacOS on your computer. It will pause and ask you how to build users. What’s going on behind the scenes is the mac is using Migration Assistant to populate your home directory. Choose Time Machine backup and go into the menus and trim all of Applications, Settings, etc. You really only want to carry over data at this point. If you don’t migrate enough information, you can use Migration Assistant or Time Machine to catch anything that you missed.
  4. Reinstall your apps using the App Store, and whatever other sources you have. As a developer I have a bunch of software installed that requires me to Control-Click on the Application and then give permission to run one time.
  5. Restore security permissions as needed. App Store packages generally won’t have this problem. Other packages will. I use Emacs as my main editor because I’ve been doing this for a while. That requires me to go into the System Preferences -> Security & Privacy -> Privacy pane and grant Emacs permission to read files from my specified locations.

That’s most of what you need. I did the operation overnight. I handled steps 1 ~ 3 and then went to sleep. When I woke up I finished up 4 and 5.

A side note here for MacPorts or Homebrew users. You’ll want to restore your MacPorts/Homebrew environment also. For MacPorts this isn’t hard. Basically run sudo port list requested > ~/Desktop/ports-requested.txt This will leave a copy of the ports you installed by hand in a text file. When you are rebuilding your machine, you’ll need to perform the prerequisites needed to run MacPorts. Then you can use this output to install the packages that you used. I don’t use HomeBrew but I’ imagine that there must be something similar to this in HomeBrew.

Git Sparse Checkout

Git Sparse Checkout

At work we had a very large monorepo. I’m tempted to quote Douglas Adams here but the reference is good enough. Checking out the whole thing runs the possibility of confusing git status messages as a result changes in the other part of the tree. These messages are a distraction. Dealing with them can consume large amounts of time. The best way to avoid them perform what’s called a sparse checkout. This is a checkout that only puts what you need into your working directory. In a normal checkout:

 $ git clone ...

You get the entire code base in your working directory. A sparse checkout is more complicated to perform:

 $ mkdir _target directory_
 $ cd _target directory_
 $ git init .
 $ git config core.sparsecheckout true
 $ echo "_your desired subdir_" >> .git/info/sparse-checkout
 $ ## Repeat the echo for each directory you need.
 $ git remote add origin https://git.neopost.com/PPT/IBMHSM.git
 $ git fetch
 $ git checkout master

It’s eight steps but if you do it this way, you gain complete control over what’s in your working directory.

OpenBSD’s ksh adds configurable tab completion

I saw a configuration for bash tab completion a few years ago and I’ve always wanted it for the korn shell. I use either the “true” ksh from AT&T via David Korn or one of the variants that has sprung out of the pdksh project. OpenBSD’s ksh is a descendant of pdksh. In a recent release of OpenBSD someone patched it to kludge configurable tab completion via environmental arrays. The article is here: https://www.vincentdelft.be/post/post_20210102

This ksh is shells/oksh in FreeBSD.

Everything needs a reset button

I don’t think I’m an Apple fan boy but I definitely like their stuff. That said, everyone could stand some improvement. One place Apple could improve is resets, and general controls.

The Story

I used to run a desktop machine 24/7 as an NFS server for backups. During the summer this heats up my office almost to the point where I can’t use it. I decided to change the Desktop to only be on when I need and to be suspended or off when idle. something has to be a target for the backups though. I chose a Mac Mini that I had. I brought the mini to my office and refreshed the machine with a new Operating System. Then I moved the NFS service to it. Finally, I turned off the desktop.

The Mini runs headless. At somepoint, Apple decided to be “helpful”. they changed OS X so that machines without any keyboard at all will plead for a bluetooth keyboard and mouse at periodically. The keyboard that connected to my work laptop obliged and connected and paired without a pin code exchange.

This morning I spent 20 minutes of my time trying to figure out why my keyboard didn’t work before I realized that pairing without a pin code was even possible.

I get that pairing in this manner reduces support calls and I understand that my use case of a Mac Mini running headless is a corner case by far but decisions that theOS X UI team have made lately really don’t help me. I disconnected the keyboard and turned off bluetooth on the mini but I’ll be surprised if that’s the end of things. Apple’s in the midst of a war on wires these days so if you connect your computer using ethernet and turn of wifi, your machine will pester you forever to turn the wifi back on. I’ll see if bluetooth is the same.