A Recipe for Removing Snapshots Stuck in Allocated or BackingUp States

Sometimes VM volume snapshots in Apache CloudStack (ACS) get stuck in unusual states such as ‘Allocated’ or ‘BackingUp’. While in those states they cannot be removed through the API. Unfortunately, the problem occurs from time to time in every ACS version; we last encountered it in ACS 4.9.2. The administrator then has to remove the snapshots from ACS some other way. This article is a step-by-step guide to doing that.

This is not a regular case, and solving it requires modifying the ACS database directly. The operation consists of three steps:

  1. change the state of the snapshot that is stuck in the wrong state;
  2. remove the snapshot through the API (or UI);
  3. check that the data has been removed from the image store.

Fixing It

First, let’s find the snapshots that are in wrong states:

mysql> select * from snapshots where status != 'BackedUp' and status != 'Destroyed' and status != 'Error';

+-----+----------------+------------+-----------+-----------+------------------+-----------+------+-----------------------------------------------------------------+--------------------------------------+---------------+------------------+-------------+---------------------+---------------------+----------------+----------+------------+--------------+-----------------+---------+-------+----------+----------+
| id  | data_center_id | account_id | domain_id | volume_id | disk_offering_id | status    | path | name                                                            | uuid                                 | snapshot_type | type_description | size        | created             | removed             | backup_snap_id | swift_id | sechost_id | prev_snap_id | hypervisor_type | version | s3_id | min_iops | max_iops |
+-----+----------------+------------+-----------+-----------+------------------+-----------+------+-----------------------------------------------------------------+--------------------------------------+---------------+------------------+-------------+---------------------+---------------------+----------------+----------+------------+--------------+-----------------+---------+-------+----------+----------+
|  66 |              1 |          4 |         1 |       376 |                1 | Allocated | NULL | VM-aab02612-8b9b-4274-b727-3505558995b3_ROOT-336_20170117082532 | c35af6d6-eb89-406c-a06a-e3ab7da48c67 |             0 | MANUAL           |  8589934592 | 2017-01-17 08:25:32 | NULL                | NULL           |     NULL |       NULL |         NULL | KVM             | 2.2     |  NULL |     NULL |     NULL |
| 729 |              1 |          2 |         1 |       607 |               18 | Creating  | NULL | DONTDELETEORYOUWILLBEFIRED_ROOT-523_20170823050255              | 83549abf-7bdd-4a75-bdaa-66c01426ab2d |             3 | HOURLY           | 53687091200 | 2017-08-23 05:02:55 | 2017-08-23 05:10:44 | NULL           |     NULL |       NULL |         NULL | KVM             | 2.2     |  NULL |     NULL |     NULL |
| 879 |              1 |          2 |         1 |       607 |               18 | BackingUp | NULL | DONTDELETEORYOUWILLBEFIRED_ROOT-523_20170828070114              | 0d6e4ef3-1275-4117-8721-cf35e6211de5 |             3 | HOURLY           | 53687091200 | 2017-08-28 07:01:14 | NULL                | NULL           |     NULL |       NULL |         NULL | KVM             | 2.2     |  NULL |     NULL |     NULL |
+-----+----------------+------------+-----------+-----------+------------------+-----------+------+-----------------------------------------------------------------+--------------------------------------+---------------+------------------+-------------+---------------------+---------------------+----------------+----------+------------+--------------+-----------------+---------+-------+----------+----------+

Keep in mind that the list can also include snapshots that are being created right now, so in real cases consider filtering on the created field as well.
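One way to apply the created filter is to build the query with a cutoff time, so that snapshots still legitimately in progress are skipped. The sketch below only constructs the SQL and its parameters; the function name and the 24-hour default are assumptions made for illustration, and the `%s` placeholders follow the parameter style used by common Python MySQL drivers.

```python
from datetime import datetime, timedelta

# States treated as final in the query used in this article.
FINAL_STATES = ("BackedUp", "Destroyed", "Error")


def stuck_snapshots_query(now, min_age_hours=24):
    """Return (sql, params) selecting non-final snapshots older than the cutoff.

    min_age_hours is an assumed threshold; pick a value longer than
    your slowest legitimate snapshot takes to complete.
    """
    cutoff = now - timedelta(hours=min_age_hours)
    sql = (
        "select id, uuid, name, status, created from snapshots "
        "where status not in (%s, %s, %s) and created < %s"
    )
    params = FINAL_STATES + (cutoff.strftime("%Y-%m-%d %H:%M:%S"),)
    return sql, params


if __name__ == "__main__":
    sql, params = stuck_snapshots_query(datetime.now())
    print(sql)
    print(params)
```

The returned pair can be passed to a driver's `cursor.execute(sql, params)`, which keeps the values out of the SQL string itself.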

Change their state to “BackedUp”:

mysql> update snapshots set status = 'BackedUp' where id=879 or id=66;

Removal

Now that the state is fixed, remove the snapshots through the API or UI.
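For the API route, the relevant command is deleteSnapshot, which takes the snapshot's uuid (not the numeric database id) in its id parameter. The sketch below builds a signed request URL following the signing scheme described in the CloudStack API documentation (sorted, URL-encoded parameters, lowercased, HMAC-SHA1, Base64). The endpoint and key values are placeholders, the function name is hypothetical, and the encoding details should be checked against your ACS version before relying on it.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_delete_snapshot_url(endpoint, api_key, secret_key, snapshot_uuid):
    """Build a signed deleteSnapshot URL (sketch; verify against your ACS docs)."""
    params = {
        "command": "deleteSnapshot",
        "id": snapshot_uuid,
        "response": "json",
        "apiKey": api_key,
    }
    # URL-encode each value and sort the pairs by lowercased parameter name.
    pairs = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in pairs)
    # Sign the lowercased query string with HMAC-SHA1, then Base64-encode it.
    digest = hmac.new(
        secret_key.encode("utf-8"), query.lower().encode("utf-8"), hashlib.sha1
    ).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return f"{endpoint}?{query}&signature={signature}"


if __name__ == "__main__":
    # Placeholder endpoint and credentials; the uuid is snapshot 879 from the table.
    print(signed_delete_snapshot_url(
        "http://localhost:8080/client/api",
        "YOUR_API_KEY",
        "YOUR_SECRET_KEY",
        "0d6e4ef3-1275-4117-8721-cf35e6211de5",
    ))
```

The resulting URL can be requested with any HTTP client; an existing CloudStack CLI or client library does the same signing for you.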

Data Removal From Image Store

Make sure the data has been wiped from secondary storage (example for the snapshot with id=879):


mysql> select snapshot_id, url, store_role, install_path from snapshot_store_ref,image_store where image_store.id=store_id and snapshot_id=879 and store_role = 'Image';
+-------------+--------------------------------------+------------+-----------------+
| snapshot_id | url                                  | store_role | install_path    |
+-------------+--------------------------------------+------------+-----------------+
|         879 | nfs://192.168.3.218/export/secondary | Image      | snapshots/2/607 |
+-------------+--------------------------------------+------------+-----------------+
1 row in set (0.00 sec)

Now, let’s visit the storage server nfs://192.168.3.218/export/secondary and make sure that the path snapshots/2/607 no longer contains the data.
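If the NFS export is mounted on a machine you can log in to, the check can be scripted. The sketch below lists any files left under the install path; the mount point /mnt/secondary and the function name are assumptions. Note that the path appears to be keyed by account and volume id (2 and 607 in the table above), so files belonging to other, still valid snapshots of the same volume may legitimately remain there.

```python
import os


def leftover_files(mount_point, install_path):
    """Return relative paths of all files found under mount_point/install_path."""
    target = os.path.join(mount_point, install_path)
    if not os.path.isdir(target):
        # The directory is gone entirely, so nothing is left behind.
        return []
    found = []
    for root, _dirs, files in os.walk(target):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), target))
    return sorted(found)


if __name__ == "__main__":
    # /mnt/secondary is an assumed local mount of the secondary storage export.
    for path in leftover_files("/mnt/secondary", "snapshots/2/607"):
        print(path)
```

An empty result means no files remain; anything printed should be compared against the snapshots that still exist for that volume before it is deleted by hand.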

Problem Sources

The problem can occur during snapshot creation for several reasons; the most common are:

  1. network connectivity failures;
  2. management server failure;
  3. hardware performance degradation or failure.

Long-term Solution

As a long-term solution, an automated script can be deployed that detects the situation and notifies the ITSM system about it (or even cleans up automatically).
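The detection part of such a script could look roughly like the sketch below. It works on rows already fetched from the snapshots table (as dictionaries) and leaves the database access and the ITSM integration out. The 6-hour threshold, the function names, and the choice to skip rows whose removed field is already set are all assumptions, not something ACS prescribes.

```python
from datetime import datetime, timedelta

# States treated as final in the query used in this article.
FINAL_STATES = {"BackedUp", "Destroyed", "Error"}


def find_stuck(rows, now, max_age=timedelta(hours=6)):
    """Return rows in a non-final state, not yet removed, older than max_age."""
    return [
        r for r in rows
        if r["status"] not in FINAL_STATES
        and r["removed"] is None          # assumption: ignore already removed rows
        and now - r["created"] > max_age  # assumption: 6 hours means stuck
    ]


def format_alert(stuck):
    """Build a plain-text message suitable for a ticket or e-mail body."""
    lines = [f"{len(stuck)} snapshot(s) stuck in a transitional state:"]
    for r in stuck:
        lines.append(
            f"  id={r['id']} uuid={r['uuid']} status={r['status']} "
            f"created={r['created']:%Y-%m-%d %H:%M:%S}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    rows = [
        {"id": 879, "uuid": "0d6e4ef3-1275-4117-8721-cf35e6211de5",
         "status": "BackingUp", "created": datetime(2017, 8, 28, 7, 1, 14),
         "removed": None},
    ]
    stuck = find_stuck(rows, datetime.now())
    if stuck:
        print(format_alert(stuck))
```

Run from cron, the script would fetch the rows, call find_stuck, and send the text from format_alert to whatever notification channel is in use. Automatic cleanup is riskier, since it repeats the direct database update described above without a human checking the list first.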