ESXi host mappings unexpectedly removed from SAN

A storage administrator mistakenly removed VMware ESXi host mappings from the SAN while they were still in use in a production environment. This of course impacted live services and I was the lucky person on call. I’m posting this because I’d hoped that simply re-adding the host mappings on the storage would resolve the issue, but that wasn’t the case. After adding the mappings again, it seemed random which LUNs showed as mapped devices and which could access data. After attempting a few methods of reconfiguring the storage, I discovered a process which, although it left VMs in a mixture of unavailable and invalid states, successfully presented the storage back to all the hosts as expected.

TLDR: Any VMs which were powered on when the host mappings were unexpectedly removed must be manually powered off from the individual hosts, even though the VMs aren’t actually accessible. The rest of this post shows the steps I took to get the VMs working as expected.

So: 1) Power off the VMs which are still being presented as powered on. 2) Rescan the storage. 3) Wait, and all devices and datastores will become available again.
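For anyone who prefers the command line, the three steps above can be sketched roughly as follows. This is only an illustration and not what was run during the incident, where everything was done through the Host Client. It assumes SSH access to the ESXi host and uses the standard `vim-cmd` and `esxcli` commands; the function name `recovery_commands` and the VM IDs passed in are hypothetical. The function only builds the command strings, it does not run anything.

```python
def recovery_commands(vm_ids):
    """Return the ESXi shell commands for the three steps, in order.

    vm_ids: IDs (as listed by `vim-cmd vmsvc/getallvms`) of the VMs the
    host still shows as powered on. Hypothetical input, gathered beforehand.
    """
    commands = []
    # Step 1: power off every VM the host still believes is running.
    for vm_id in vm_ids:
        commands.append(f"vim-cmd vmsvc/power.off {vm_id}")
    # Step 2: rescan all storage adapters so the re-added mappings are seen.
    commands.append("esxcli storage core adapter rescan --all")
    # Step 3: list the mounted filesystems to check the datastores are back.
    commands.append("esxcli storage filesystem list")
    return commands


if __name__ == "__main__":
    # Example VM IDs are made up for illustration.
    for line in recovery_commands([12, 15]):
        print(line)
```

Printing the list first and running the commands by hand keeps a person in the loop, which seems sensible when the host is already in an inconsistent state.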

Logging on after receiving the call-out, I found the vCenter to be inaccessible, so I logged on to the Host Clients to manage this at an individual host level. VMs were showing as Powered On but I’d get an error when accessing the console, viewing details or changing settings on the VMs. There were no errors on the Datastores but I couldn’t navigate the file structure, which pointed to a storage problem.

VMs showing as Powered On
Error when opening Edit Settings

I moved on to our IBM SVC, but the only alert was for LDAP, which would’ve been caused by the domain controller VM being down. Datastores presented to the vSphere environments were showing Online, but looking into their configuration showed there were no host mappings. Logs showed the mappings being manually removed, so on to resolution.

I hoped that re-adding the incorrectly removed host mappings in the IBM SVC GUI would present the datastores back to the hosts exactly as they were before. Unfortunately that wasn’t the case and, at first glance, it gave some inconsistent results, as different hosts could access different datastores. This is one of our smaller environments and is in the process of being decommissioned, so I was able to quickly work through some troubleshooting and reconfiguration in an attempt to get the datastores available.

One of the hosts had only one vCLS VM running, and only one of its LUNs didn’t present. I powered off that single VM and carried on troubleshooting in other areas, but when I returned, all LUNs were present. This showed that VMs still marked as Powered On from before the mappings were removed were actually preventing their datastores from being presented back to the host.

I’d noted which servers were showing a Powered On state before troubleshooting, so I powered off all VMs by selecting all VMs > Right Click > Power > Power Off.
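If the Host Client isn’t an option, the list of VMs a host still believes are running can also be read from `esxcli vm process list` over SSH. The snippet below is a minimal sketch of parsing that output into a name-to-World-ID mapping, which could then be used with `esxcli vm process kill --type=soft --world-id=<id>`. The function name and the sample text are made up for illustration, and the sample only mimics the general layout of the command’s output.

```python
def parse_vm_processes(text):
    """Map display name -> World ID from `esxcli vm process list` output."""
    vms = {}
    world_id = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("World ID:"):
            # Remember the World ID until the matching Display Name appears.
            world_id = int(line.split(":", 1)[1])
        elif line.startswith("Display Name:") and world_id is not None:
            vms[line.split(":", 1)[1].strip()] = world_id
            world_id = None
    return vms


# Illustrative sample only; names and IDs are invented.
SAMPLE = """\
app01
   World ID: 1001
   Process ID: 0
   Display Name: app01

dc01
   World ID: 2002
   Process ID: 0
   Display Name: dc01
"""

if __name__ == "__main__":
    for name, wid in parse_vm_processes(SAMPLE).items():
        print(f"{name}: esxcli vm process kill --type=soft --world-id={wid}")
```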

I then navigated to the Storage section, clicked Rescan and refreshed the storage devices.

Once the rescan had completed, the Devices showed as Normal instantly, but the Datastores took a couple of minutes to return as expected with capacity and details present.

Once the host running the vCenter appliance could access the storage, I was able to access the vCenter and confirm VM and storage states to ensure services were back running as expected. It’s worth noting that the VMs came back in a mix of powered on and powered off states, so I referred back to the notes I’d taken earlier to ensure the servers were in the correct power state.
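Checking the notes against the current state is easy to get wrong by eye when there are a lot of VMs. As a small sketch, assuming the noted and current power states have each been typed or exported into a simple name-to-state dictionary (the function name, VM names and state labels here are all hypothetical), the comparison could look like this:

```python
def power_state_mismatches(noted, current):
    """Return {vm: (expected, actual)} for VMs that differ from the notes.

    A VM that is in the notes but absent from `current` is reported with
    the actual state "missing".
    """
    return {
        vm: (expected, current.get(vm, "missing"))
        for vm, expected in noted.items()
        if current.get(vm, "missing") != expected
    }


if __name__ == "__main__":
    # Example data is invented for illustration.
    noted = {"dc01": "on", "app01": "on", "test01": "off"}
    current = {"dc01": "on", "app01": "off", "test01": "off"}
    for vm, (expected, actual) in power_state_mismatches(noted, current).items():
        print(f"{vm}: expected {expected}, currently {actual}")
```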
