Hi, we are having some strange issues with an MSA P2000 SAS and VMware, and wondered whether anybody has seen similar problems before.
Our environment is as follows:
HP MSA P2000 G3 SAS – firmware TS251P006-02
2 x HP servers used as ESX hosts (ESXi 5.5), directly attached via SAS to the MSA. Each server is connected to each controller.
We initially had a single RAID-5 vdisk, with a single volume provisioned to ESX. This has worked fine for years. The hosting team uses thin provisioning at the VMware level. Recently they had problems because it appeared they had run out of actual capacity due to the thin provisioning. This caused problems with the VMs, as you’d expect. It appeared to be purely a capacity management issue.
The hosting team worked to free up some space and managed to get the VMs back online. According to VMware there is now around 400GB free in the datastore; however, they are still seeing disk capacity errors when trying to provision VMs. We assumed this was a problem with VMware not releasing the now-unused thin-provisioned space. However...
In the background, an additional 4 disks were purchased to allow us to create a new vdisk and present additional capacity to VMware. We decided to create a new RAID-5 vdisk rather than extend the existing one, as it seemed the safer option. The vdisk was created and a single volume (roughly 1.3TB) was presented to the ESX hosts. The ESX hosts detected the volume and a datastore was created, all without issue. We assumed our job was done. Unfortunately, even though the datastore was created without issue, the hosting team is unable to migrate any VMs to it. Migrations start and then fail with an ‘insufficient disk space’ error.
Perhaps coincidentally, on the day the majority of the issues started, the CompactFlash card in one of the MSA controllers was replaced due to a failure. Since the replacement, the array has been up and reporting healthy.
We have tried the following:
Migrating VMs from the old datastore to the new one.
Creating new thick-provisioned VMs on the new datastore.
Creating new thin-provisioned VMs on the new datastore.
Rebooting the MSA controllers one at a time (rather than rebooting the whole array, to avoid an outage).
Rebooting both ESX hosts.
Disabling VAAI within ESX (the HP-specific VAAI plug-in doesn’t seem to be installed).
Provisioning a smaller volume.
Creating the datastore as VMFS3 rather than VMFS5.
Logging into the ESX hosts via SSH and trying to copy files into the new datastore.
In every test, some data is copied and then we get the ‘insufficient disk space’ error or something similar. The amount of data copied varies, but it is never very much.
We are at a loss to understand what could be happening, as the system has worked fine for a number of years. We’re not even sure where the problem lies: VMware, the MSA, or elsewhere. Since the datastore can be created and some data can be copied, it feels like the volume is writeable but the communication is perhaps being interrupted.
Any help/suggestions would be much appreciated!