Snapshot consolidation failed with error

The virtual machine was showing the following alert:

Virtual machine disk consolidation is needed.  Virtual machine consolidation needed status

When I tried to consolidate the virtual machine snapshots, I received the following error:

Status: An error occurred while consolidating disks: Could not open/create change tracking file

Cannot complete the operation because the file or folder ds:///vmfs/volumes/*.vmdk already exists

Troubleshooting:
Connected to the ESXi host over SSH and browsed to the volume where the VM resides.
Changed into the VM folder and found flat files for each of the VMDK disks, but the VM does not show that any snapshots exist.
Tried to vMotion the VM and received the same error.
Tried to Storage vMotion the VM and received the same error.
Tried creating a new snapshot and deleting it again, but still could not consolidate afterwards.
Found KB 2013520, which describes the same problem, but only for committing a snapshot that already exists.
The issue seems to be caused by corrupted CTK files. These files are associated with each delta disk and flat file of the virtual machine.
Resolution:
  • Power off the VM
  • Create a temp folder in the VM folder on the datastore
  • Move all the CTK files into the temporary folder. The file names will look like “*-ctk.vmdk”
  • Right-click the VM and select Snapshot -> Consolidate
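The two file-moving steps above can be sketched in Python. This is only an illustration of the idea, not the commands I ran: the function name and the "ctk-temp" folder name are my own assumptions, and the VM must already be powered off before anything is moved.

```python
import shutil
from pathlib import Path


def move_ctk_files(vm_dir, temp_name="ctk-temp"):
    """Move every *-ctk.vmdk file in vm_dir into a temp subfolder.

    temp_name is a hypothetical folder name; any name will do.
    Returns the names of the files that were moved.
    """
    vm_path = Path(vm_dir)
    temp_path = vm_path / temp_name
    temp_path.mkdir(exist_ok=True)

    moved = []
    # Only the top level of the VM folder is searched. Descriptor and
    # -flat.vmdk files do not match the pattern and are left in place.
    for ctk in sorted(vm_path.glob("*-ctk.vmdk")):
        shutil.move(str(ctk), str(temp_path / ctk.name))
        moved.append(ctk.name)
    return moved
```

The files are moved rather than deleted, so they can be put back if the consolidation still fails.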
On a side note:
After some further investigation I found that there was a very large snapshot on this VM (> 150 GB), and committing it seems to have corrupted the CTK files.
CTK file description
Always nice learning something new each day 🙂

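The VMDK descriptor files (the small *.vmdk files) describe the characteristics/attributes of each disk, while the actual ‘data’ is stored in the *-flat.vmdk file. The *-ctk.vmdk files sit alongside them and hold the change tracking information for each disk.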

As in our case, if you have multiple disks attached to a VM, you’ll have multiple descriptor files that link to their actual disks. You can view the contents of a descriptor file, which provides information such as chain IDs, the type of VMDK, its data disk, the hardware version, etc.
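To make that concrete, here is a small Python sketch that reads the key/value pairs and the data disk reference out of a descriptor. The sample text is illustrative only: the file names and values are made up and are not taken from the VM in this post, and the parser is a simplification that ignores anything it does not recognise.

```python
import re

# Illustrative descriptor contents; names and values are assumptions.
SAMPLE_DESCRIPTOR = '''\
# Disk DescriptorFile
version=1
CID=fffffffe
parentCID=ffffffff
createType="vmfs"
changeTrackPath="myvm-ctk.vmdk"

# Extent description
RW 8388608 VMFS "myvm-flat.vmdk"

# The Disk Data Base
ddb.virtualHWVersion = "8"
'''


def parse_descriptor(text):
    """Return the key/value fields and the data files a descriptor points to."""
    fields = {}
    extents = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Extent lines look like: RW <sectors> <type> "<file>"
        extent = re.match(r'^(RW|RDONLY|NOACCESS)\s+\d+\s+\w+\s+"([^"]+)"', line)
        if extent:
            extents.append(extent.group(2))
            continue
        # Everything else of interest is key=value, optionally quoted.
        pair = re.match(r'^([\w.]+)\s*=\s*"?([^"]*)"?$', line)
        if pair:
            fields[pair.group(1)] = pair.group(2)
    return {"fields": fields, "extents": extents}
```

Calling parse_descriptor(SAMPLE_DESCRIPTOR) gives a dictionary in which "fields" holds entries such as CID, parentCID and createType, and "extents" lists the flat file that holds the data.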
The CTK files are what changed block tracking (CBT) uses, and VMware has a good KB to read on this.

Commvault – Backup Copy of snapshot for Exchange 2010

I have been running into an interesting scenario with a backup process, and I hope this write-up will help someone else.
I have an Exchange DAG backup with IntelliSnap, which creates a snapshot of each of the Exchange database volumes on NetApp. The snapshot is then mounted and copied via SAN transport over the back-end fiber connection to the storage library on the Commvault agent.
The setup has a storage policy that performs a backup copy of the latest snap via SAN transport weekly on Thursdays.
With Commvault I find that sometimes the snapshots are not copied to the Commvault storage library, and no error of any kind is raised.
When viewing the jobs for the primary copy of the snapshot, it just shows a status of “to be copied”, but no failure.
This causes problems, however: if you do not pay attention you can end up with weeks of snapshots that were never copied, and if your snapshots are retained for only a few days you lose all those copies and have to mark them bad! (not good!)
I have not found a way to resolve this, even with the help of Commvault support, but here are a few tips to make sure you are at least informed in more detail about this process:
  • A good indication that the backup copy is not taking place is that the snapshots on the NetApp volume are not being deleted.
  • Also set up a very detailed report for the day after the snapshot copy was supposed to take place, and make sure the detailed backup copy information is included in this report.

If, however, you do run into this problem and your snapshots have already been rotated off the volumes on storage, you will have to perform the following tasks to continue the backup copy process.
(I recommend contacting Commvault support if you are not familiar with your storage, snapshots and IntelliSnap. The steps below are provided for use at your own risk, and you will lose backup data with these steps.)
  • Verify the date of the oldest snapshot that is still on the database volumes
  • Select the storage policy
  • Right-click the primary copy and select View -> Jobs
  • This will list all the backup copies and show the ones with status “to be copied”.
  • Select all jobs dated before the oldest snapshot you still have on the NetApp volume.
  • Right-click and select “Mark job bad” (I know, not my favorite step)
  • Right-click the storage policy again and select All Tasks -> Run Backup copy.
  • This will start the backup copy of your latest snapshot that resides on the backup storage volume.
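The selection rule in the steps above (every “to be copied” job dated before the oldest snapshot still on the volume) can be written down as a small Python sketch. It does not talk to Commvault at all; the job records, their field names and the dates are hypothetical and only show the logic you apply by eye in the job list.

```python
from datetime import date


def jobs_to_mark_bad(jobs, oldest_snapshot_date):
    """Return the jobs whose snapshot is gone but which are still waiting.

    jobs is a list of dicts with hypothetical keys "id", "snap_date"
    and "status". A job qualifies when it is still "to be copied" and
    its snapshot is older than the oldest snapshot left on the volume.
    """
    return [
        job
        for job in jobs
        if job["status"] == "to be copied"
        and job["snap_date"] < oldest_snapshot_date
    ]


# Made-up example data for illustration only.
example_jobs = [
    {"id": 101, "snap_date": date(2013, 5, 2), "status": "to be copied"},
    {"id": 102, "snap_date": date(2013, 5, 9), "status": "to be copied"},
    {"id": 103, "snap_date": date(2013, 5, 16), "status": "to be copied"},
    {"id": 100, "snap_date": date(2013, 4, 25), "status": "copied"},
]
stale = jobs_to_mark_bad(example_jobs, oldest_snapshot_date=date(2013, 5, 16))
```

With this example data, stale contains jobs 101 and 102. Job 103 still has its snapshot on the volume, so it is left for the next backup copy run.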
Another solution is to manually pick snapshots taken during the week for backup copy.

  • Right-click the storage policy again and select View -> Jobs
  • Right-click the snap you want to create a backup copy of.
  • Select “Pick for backup copy”
  • As a note, you cannot create a backup copy of a snapshot that is older than your latest backup copy, so make sure you back up your oldest snapshot first.
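The ordering rule in that last note can be sketched as follows. Again this is plain Python over made-up dates, with a function name of my own choosing, and it only illustrates the order in which to pick snapshots, not anything Commvault exposes.

```python
from datetime import date


def pick_order(snapshot_dates, latest_backup_copy_date=None):
    """Return the snapshots that can still be picked, oldest first.

    Snapshots that are not newer than the latest backup copy are
    dropped, because they can no longer be picked. Pass None when no
    backup copy exists yet.
    """
    if latest_backup_copy_date is None:
        eligible = list(snapshot_dates)
    else:
        eligible = [d for d in snapshot_dates if d > latest_backup_copy_date]
    return sorted(eligible)


# Made-up example: the latest backup copy is from 14 May.
order = pick_order(
    [date(2013, 5, 16), date(2013, 5, 13), date(2013, 5, 15)],
    latest_backup_copy_date=date(2013, 5, 14),
)
```

Here order lists 15 May before 16 May, and the 13 May snapshot is left out because it predates the latest backup copy.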
If anybody else is running into the same problem, I would love to hear from you.