There will eventually be a KB for this, but in the meantime I wanted to show how you can clean up PKS-created NSX-T resources in NSX-T manager. You may need to do this when a PKS cluster fails to create or delete properly; removing the leftover objects by hand is tedious and error-prone.
Please make sure you have valid backups of the NSX-T manager before proceeding.
Installation
- curl -LO https://storage.googleapis.com/pks-releases/pks_cleanup_linux
- chmod +x pks_cleanup_linux
- sudo mv pks_cleanup_linux /usr/local/bin/pks_cleanup
Execution
The utility’s help output lists all of the options; the following example run includes the required ones:
pks_cleanup --mgr-ip=192.168.111.46 \
--username=admin \
--password=VMware1! \
--cluster=pks-18ef47d8-d4ac-4d6c-9d77-301860c3a98f \
--read-only=false \
--pks \
--floating-ip-pool-id=5a35b05c-70d4-4337-9f8e-b8b8533476c7 \
--ip-block-id=d5aab712-4b83-4690-a16f-f6a3583c9056
- mgr-ip is the IP address of your NSX-T manager
- username and password are the credentials for NSX-T manager
- cluster is pks- followed by the UUID of the PKS cluster you’re cleaning up
- Setting read-only to true will show you what will be deleted but won’t actually delete anything.
- pks specifies that you want to delete PKS created resources
- floating-ip-pool-id is defined in NSX-T manager > Inventory > Groups > IP Pools
- ip-block-id is the master/worker node IP pool defined in NSX-T manager > DDI > IPAM
You can ignore the following messages:
ResourceDeleteFunc(): unrecognized resource type: TIER0
ResourceCollectFunc(): unrecognized resource type: NatRule
ResourceDeleteFunc(): unrecognized resource type: NatRule
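If those lines clutter the output, one option is to filter them out. The following is only a minimal sketch: filter_known_noise is a helper name I made up (it is not part of pks_cleanup), and it assumes the messages appear verbatim as shown above. I have not checked whether the utility writes them to stdout or stderr, so the usage line merges both streams.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of pks_cleanup): drops lines containing one of
# the three known-harmless messages and passes every other line through.
filter_known_noise() {
  grep -v -F \
    -e 'ResourceDeleteFunc(): unrecognized resource type: TIER0' \
    -e 'ResourceCollectFunc(): unrecognized resource type: NatRule' \
    -e 'ResourceDeleteFunc(): unrecognized resource type: NatRule' \
    || true   # grep exits 1 when every line was filtered; not an error here
}

# Example usage (options elided, same as in the run above):
#   pks_cleanup <options> 2>&1 | filter_known_noise
```

Note that piping the output this way hides the exit status of pks_cleanup unless you enable pipefail in your shell.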
Script
Since a lot of the parameters will usually be the same, I created this script so that you can just specify the PKS cluster UUID and the read-only mode.
Create a file named pks-cleanup.sh and paste in the contents below. You’ll need to adjust the NSX constants at the beginning to match your environment:
#!/usr/bin/env bash

NSX_MANAGER_USERNAME=admin
NSX_MANAGER_PASSWORD=VMware1!
NSX_MANAGER_IP=192.168.100.110
PKS_CLUSTER_UUID=$1
READ_ONLY=$2

# FLOATING_IP_POOL_ID is LB pool defined in NSX-T manager > Inventory > Groups > IP Pools
FLOATING_IP_POOL_ID=725ed0d6-c197-4b2b-ac5e-8c4981caa5fb

# IP_BLOCK_ID is the node ip pool defined in NSX-T manager > DDI > IPAM
IP_BLOCK_ID=ad51f33b-e7ae-45f5-81dd-fd481177f1dc

# Usage:
# Read-only mode: ./pks-cleanup.sh <PKS_CLUSTER_UUID> true
# Delete mode: ./pks-cleanup.sh <PKS_CLUSTER_UUID> false

pks_cleanup --mgr-ip ${NSX_MANAGER_IP} \
  --username ${NSX_MANAGER_USERNAME} \
  --password ${NSX_MANAGER_PASSWORD} \
  --cluster "pks-${PKS_CLUSTER_UUID}" \
  --read-only=${READ_ONLY} \
  --pks \
  --floating-ip-pool-id ${FLOATING_IP_POOL_ID} \
  --ip-block-id ${IP_BLOCK_ID}
Make the script executable
chmod +x pks-cleanup.sh
Run in read-only mode to see what would be deleted in NSX-T
./pks-cleanup.sh <PKS_CLUSTER_UUID> true
Run in write mode to delete items in NSX-T
./pks-cleanup.sh <PKS_CLUSTER_UUID> false
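The script passes both arguments straight through, so a mistyped UUID, a UUID pasted with the pks- prefix already attached, or a missing mode reaches pks_cleanup unchecked. Here is a minimal sketch of a guard you could add. validate_args is a hypothetical name of my own, and the pattern assumes lowercase hexadecimal UUIDs like the ones in the examples above.

```bash
#!/usr/bin/env bash

# Hypothetical guard (my own naming): returns 0 only when the first argument
# looks like a bare UUID and the second is exactly "true" or "false".
validate_args() {
  local uuid=$1 read_only=$2
  local uuid_re='^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'

  if [[ ! $uuid =~ $uuid_re ]]; then
    echo "error: '$uuid' is not a bare cluster UUID (omit the pks- prefix)" >&2
    return 1
  fi
  if [[ $read_only != true && $read_only != false ]]; then
    echo "error: read-only mode must be 'true' or 'false', got '$read_only'" >&2
    return 1
  fi
}

# Possible placement near the top of pks-cleanup.sh:
#   validate_args "$1" "$2" || exit 1
```

Failing early like this matters most for the second argument, since it decides whether anything actually gets deleted.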
Hello,
I seem to be unable to use the utility successfully with read-only set to false. When it is set to true, I see a large number of lines showing what will be deleted. When run with false, I get the output below. I’ve tried it with a few abandoned cluster IDs (IP address obscured):
./pks-cleanup-lab.sh f1cd2c13-cbdd-4714-94db-7038949283b7 false
failed to cleanup pks created resources [POST /pools/ip-pools/{pool-id}][400] allocateOrReleaseFromIpPoolBadRequest &{ErrorCode:5110 ErrorMessage:IP Address 10.x.x.x does not belong to any of the existing ranges in the pool with id IpPool/cc591217-64be-42c1-b795-58105022eac3. ModuleName:id-allocation service}
Might you know what I am doing wrong?
Thanks – Mark
What version of PKS are you using? I don’t remember exactly when, but I think from around 1.2 you no longer need to run this script; the “pks delete-cluster” command should take care of it. If that’s not working, make sure that “bosh vms” doesn’t show the deployment. If you see the deployment, you can run “bosh -d <deployment-name> delete-deployment”. If that doesn’t work and leaves VMs around, you can run “bosh -d <deployment-name> delete-deployment --force”. Once you no longer see the deployment in the output of “bosh vms”, you can run “pks delete-cluster” again; it should say something like the cluster has already been deleted and will then talk to NSX-T manager to delete all the old objects.
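That check-then-delete order could be sketched roughly as follows. Treat it as an illustration only: deployment_listed is a hypothetical helper that simply searches the text on stdin for a name, so it assumes the deployment name appears somewhere in the “bosh vms” output, and DEPLOYMENT and CLUSTER_NAME are placeholders you would fill in from your own environment.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the given deployment name appears anywhere
# in the text on stdin (intended input: the output of "bosh vms").
deployment_listed() {
  local name=$1
  [[ -n $name ]] || return 2   # an empty name would match every line
  grep -q -F -- "$name"
}

# Sketch of the sequence, left as comments so nothing is deleted by accident.
# DEPLOYMENT and CLUSTER_NAME are placeholders, not values from this post:
#   if bosh vms | deployment_listed "$DEPLOYMENT"; then
#     bosh -d "$DEPLOYMENT" delete-deployment   # add --force if VMs remain
#   fi
#   pks delete-cluster "$CLUSTER_NAME"
```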