Thursday, June 2, 2022

[lunar.lab] Allow TKGm Workload Cluster to Pull Image from Harbor Configured with Self-signed Certificate

Disclaimer

  • This method is kind of a hack and hence **unsupported**.
  • I do this only within my lab or a PoC in a controlled environment.

Problem Statement

A TKGm workload cluster does not allow pulling images from a container registry configured with a self-signed certificate.

Attempting to do so throws an error message like the following:

x509: certificate signed by unknown authority
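
In practice this surfaces as ErrImagePull / ImagePullBackOff on the pod, with the x509 error visible in the pod events. A quick way to see it (using a hypothetical deployment named myapp that pulls from the self-signed registry):

kubectl get pods -l app=myapp                   # STATUS shows ErrImagePull or ImagePullBackOff
kubectl describe pods -l app=myapp | grep x509  # Failed to pull image ... x509: certificate signed by unknown authority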

Context

I deployed Harbor Registry as a local container registry in my lab and configured it with a self-signed certificate. More about it can be found in this post: https://bit.ly/3NndGBT

In Step 3 of that post, the self-signed certificate for Harbor was created and stored on my Bootstrap Machine (which also serves as my Harbor Registry) as:

/etc/docker/certs.d/harbor-01a.corp.local/ca.crt
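
Before going further, it is worth confirming that this file really is the CA certificate you expect (assuming openssl is available on the Bootstrap Machine):

openssl x509 -in /etc/docker/certs.d/harbor-01a.corp.local/ca.crt -noout -subject -issuer -dates   # print subject, issuer, and validity period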

During the Bootstrap Machine configuration documented in this post: https://bit.ly/3x2Yhkm, in Step 3, I created an SSH key pair and stored it on the Bootstrap Machine as:

~/.ssh/id_rsa
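
If you are rebuilding this setup, the key pair can be generated along these lines; note that the public half must be supplied to TKGm at cluster creation time (via the VSPHERE_SSH_AUTHORIZED_KEY variable in the cluster configuration) for the capv login used later to work:

ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ""   # private key: ~/.ssh/id_rsa, public key: ~/.ssh/id_rsa.pub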

The Harbor certificate and SSH private key above will be required for the steps explained below. I run all the steps from the Bootstrap Machine, so the certificate and private key are already there.

Consequences

The following steps are manual configuration performed on each Kubernetes node. Scaling the cluster, or any operation that causes a node to be redeployed, requires the procedure to be applied again to the new node(s).
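
After such an operation, one way to spot nodes that still need the fix is to diff the current InternalIPs against the ip-list file produced in the procedure below (a minimal sketch, assuming ip-list from the previous run is still present):

kubectl get nodes -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}' | sort > ip-list-now
comm -13 <(sort ip-list) ip-list-now   # prints only the IPs missing from the original ip-list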

Procedure

I adapted the procedure explained by Cormac Hogan in this article: https://cormachogan.com/2020/06/23/integrating-embedded-vsphere-with-kubernetes-harbor-registry-with-tkg-guest-clusters/. The original article explains how to do this on a vSphere with Tanzu guest cluster (TKGs), while my environment is TKGm. The other difference is that my TKGm nodes use containerd as the container runtime, as opposed to docker in the original article.
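
That runtime difference changes where the certificate needs to go. With docker as the runtime, a per-registry certificate directory on each node would do the job; with containerd (and no per-registry configuration in containerd's config), the simplest approach is to append the CA to the node's system trust bundle, which is exactly what the loop below does. A rough comparison, as run on a node (the docker variant is what the original TKGs article does; the containerd path matches the Photon OS nodes in this lab):

# docker runtime: drop the CA into a per-registry directory, then restart docker
sudo mkdir -p /etc/docker/certs.d/harbor-01a.corp.local
sudo cp registry_ca.crt /etc/docker/certs.d/harbor-01a.corp.local/ca.crt
sudo systemctl restart docker

# containerd runtime: append the CA to the system trust bundle, then restart containerd
sudo bash -c 'cat registry_ca.crt >> /etc/pki/tls/certs/ca-bundle.crt'
sudo systemctl restart containerd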

The procedure is quite simple. Using your targeted TKGm workload cluster context, retrieve the IP addresses of all nodes with this command:

: > ip-list   # start with an empty file so re-runs do not duplicate entries
for i in `kubectl get nodes --no-headers | awk '{print $1}'`
do
  # extract each node's InternalIP and append it to ip-list, one per line
  kubectl get node $i -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}' >> ip-list
  echo >> ip-list
done
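
As a side note, the same list can be built in one shot with a jsonpath range expression, if you prefer to avoid the loop:

kubectl get nodes -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}' > ip-list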

I then check that the IP addresses were retrieved successfully.

cat ip-list
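
The expected output is one InternalIP per line, something like this (addresses purely illustrative):

192.168.110.51
192.168.110.52
192.168.110.53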

Then run the following loop to:

  • Log in to each node using the private key
  • Transfer the Harbor certificate to each node and append it to the node's CA bundle
  • Restart the containerd service

for i in `cat ip-list`
do
  # copy the Harbor CA certificate to the node's capv home directory
  scp -i /home/holuser/.ssh/id_rsa /etc/docker/certs.d/harbor-01a.corp.local/ca.crt capv@${i}:/home/capv/registry_ca.crt
  # append the certificate to the node's system CA bundle
  ssh -i /home/holuser/.ssh/id_rsa capv@${i} \
    'sudo bash -c "cat /home/capv/registry_ca.crt >> /etc/pki/tls/certs/ca-bundle.crt"'
  # restart containerd so it picks up the updated bundle
  ssh -i /home/holuser/.ssh/id_rsa capv@${i} 'sudo systemctl restart containerd'
done

Note that I point to the SSH private key and Harbor certificate mentioned in the Context above. Also note that I use the capv user to access the TKGm workload cluster nodes (Reference: https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.5/vmware-tanzu-kubernetes-grid-15/GUID-troubleshooting-tkg-tips.html#connect-to-cluster-nodes-with-ssh-11).
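
Before deploying anything, an optional sanity check is to confirm the certificate landed in the bundle and containerd came back up (a minimal sketch, checking just the first node in the list):

i=$(head -1 ip-list)
ssh -i /home/holuser/.ssh/id_rsa capv@${i} 'tail -3 /etc/pki/tls/certs/ca-bundle.crt'   # should end with -----END CERTIFICATE-----
ssh -i /home/holuser/.ssh/id_rsa capv@${i} 'sudo systemctl is-active containerd'        # should print: active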

Now to the moment of truth. I had already created a project called mydemo in my Harbor Registry and pushed an Nginx image called tanzuweb. I create a deployment using that image and expose it using a LoadBalancer service.

kubectl create deployment tanzuweb --image="harbor-01a.corp.local/mydemo/tanzuweb:0.1"
kubectl expose deployment tanzuweb --port=80 --name=tanzuweblb --type=LoadBalancer
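
To watch it come up (kubectl create deployment labels the pods with app=tanzuweb):

kubectl get pods -l app=tanzuweb   # should reach Running with no image pull errors
kubectl get svc tanzuweblb         # EXTERNAL-IP is populated once the load balancer assigns one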

And this is the result.



All seems good now. Credit goes to Cormac Hogan for all the heavy lifting.



