Sometimes VM volume snapshots in ACS get stuck in unusual states such as ‘Allocated’ or ‘BackingUp’. While in those states they cannot be removed through the API. Unfortunately, the problem shows up from time to time in every ACS version; we last met it in ACS 4.9.2. The administrator has to remove such snapshots from ACS some other way. This article provides a step-by-step guide to addressing the problem.
This is not a routine operation: to solve it, the administrator must modify the ACS database directly. The operation is performed in three steps:
- change the state of the snapshot that is stuck in a wrong state;
- remove the snapshot through the API (or UI);
- check that the data has been removed from the image store.
Fixing it
First, let’s find the snapshots that are in wrong states:
mysql> select * from snapshots where status != 'BackedUp' and status != 'Destroyed' and status != 'Error';
+-----+----------------+------------+-----------+-----------+------------------+-----------+------+-----------------------------------------------------------------+--------------------------------------+---------------+------------------+-------------+---------------------+---------------------+----------------+----------+------------+--------------+-----------------+---------+-------+----------+----------+
| id | data_center_id | account_id | domain_id | volume_id | disk_offering_id | status | path | name | uuid | snapshot_type | type_description | size | created | removed | backup_snap_id | swift_id | sechost_id | prev_snap_id | hypervisor_type | version | s3_id | min_iops | max_iops |
+-----+----------------+------------+-----------+-----------+------------------+-----------+------+-----------------------------------------------------------------+--------------------------------------+---------------+------------------+-------------+---------------------+---------------------+----------------+----------+------------+--------------+-----------------+---------+-------+----------+----------+
| 66 | 1 | 4 | 1 | 376 | 1 | Allocated | NULL | VM-aab02612-8b9b-4274-b727-3505558995b3_ROOT-336_20170117082532 | c35af6d6-eb89-406c-a06a-e3ab7da48c67 | 0 | MANUAL | 8589934592 | 2017-01-17 08:25:32 | NULL | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
| 729 | 1 | 2 | 1 | 607 | 18 | Creating | NULL | DONTDELETEORYOUWILLBEFIRED_ROOT-523_20170823050255 | 83549abf-7bdd-4a75-bdaa-66c01426ab2d | 3 | HOURLY | 53687091200 | 2017-08-23 05:02:55 | 2017-08-23 05:10:44 | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
| 879 | 1 | 2 | 1 | 607 | 18 | BackingUp | NULL | DONTDELETEORYOUWILLBEFIRED_ROOT-523_20170828070114 | 0d6e4ef3-1275-4117-8721-cf35e6211de5 | 3 | HOURLY | 53687091200 | 2017-08-28 07:01:14 | NULL | NULL | NULL | NULL | NULL | KVM | 2.2 | NULL | NULL | NULL |
+-----+----------------+------------+-----------+-----------+------------------+-----------+------+-----------------------------------------------------------------+--------------------------------------+---------------+------------------+-------------+---------------------+---------------------+----------------+----------+------------+--------------+-----------------+---------+-------+----------+----------+
Keep in mind that the list can include snapshots that are being created right now, so consider filtering on the created field in real cases.
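A filter on the created field could be sketched as follows. This is a minimal sketch, not part of the original procedure: the function name stuck_snapshot_query, the six-hour threshold and the extra removed IS NULL condition are assumptions, and the %s placeholders assume a DB-API driver with that parameter style (for example PyMySQL). The caller is also assumed to pass the current time in the same time zone the database uses for created.

```python
from datetime import datetime, timedelta

# States that the article's query treats as normal; anything else is a candidate.
NORMAL_STATES = ("BackedUp", "Destroyed", "Error")


def stuck_snapshot_query(now, min_age=timedelta(hours=6)):
    """Return (sql, params) selecting snapshots that are outside the normal
    states and were created more than min_age ago.

    min_age is an assumed threshold; pick one longer than your slowest backup.
    """
    cutoff = (now - min_age).strftime("%Y-%m-%d %H:%M:%S")
    placeholders = ", ".join(["%s"] * len(NORMAL_STATES))
    sql = (
        "SELECT id, name, status, created FROM snapshots "
        f"WHERE status NOT IN ({placeholders}) "
        "AND removed IS NULL "  # assumption: skip rows already marked as removed
        "AND created < %s"
    )
    return sql, NORMAL_STATES + (cutoff,)
```

The returned pair can be passed to cursor.execute(sql, params); snapshots younger than the threshold are left out, so a backup that is simply still running is not reported.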
Change their state to “BackedUp”:
mysql> update snapshots set status = 'BackedUp' where id=879 or id=66;
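Because this is a direct write to the database, it may be worth guarding the update so that it only touches rows that are still stuck. The sketch below builds such a statement; the function name build_state_fix and the default list of stuck states are assumptions for illustration, and the %s placeholders again assume a DB-API driver such as PyMySQL.

```python
def build_state_fix(snapshot_ids, stuck_states=("Allocated", "BackingUp")):
    """Return (sql, params) for an UPDATE that changes only the given ids,
    and only while they are still in one of the stuck states."""
    ids = [int(i) for i in snapshot_ids]  # reject anything that is not a number
    if not ids:
        raise ValueError("no snapshot ids given")
    id_marks = ", ".join(["%s"] * len(ids))
    state_marks = ", ".join(["%s"] * len(stuck_states))
    sql = (
        "UPDATE snapshots SET status = 'BackedUp' "
        f"WHERE id IN ({id_marks}) AND status IN ({state_marks})"
    )
    return sql, tuple(ids) + tuple(stuck_states)
```

With the guard in place, a snapshot that reached a normal state between the select and the update is left alone.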
Removal
Now that the state is fixed, remove the snapshots through the API or UI.
Data Removal From Image Store
Make sure the data has been wiped from the secondary storage (example for the snapshot with id=879):
mysql> select snapshot_id, url, store_role, install_path from snapshot_store_ref,image_store where image_store.id=store_id and snapshot_id=879 and store_role = 'Image';
+-------------+--------------------------------------+------------+-----------------+
| snapshot_id | url | store_role | install_path |
+-------------+--------------------------------------+------------+-----------------+
| 879 | nfs://192.168.3.218/export/secondary | Image | snapshots/2/607 |
+-------------+--------------------------------------+------------+-----------------+
1 row in set (0.00 sec)
Now, let’s visit the storage server nfs://192.168.3.218/export/secondary and ensure that the path snapshots/2/607 doesn’t have the data.
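The check on the storage server can be sketched as a small helper that lists whatever is still present under the install path. It assumes the NFS export is already mounted locally; the function name leftover_files and the mount point used in the note below are hypothetical. The helper only lists files and never deletes anything, because the same directory may also hold other snapshots of the same volume.

```python
import os


def leftover_files(mount_point, install_path):
    """List files still present under install_path, relative to mount_point.

    An empty list means nothing is left there (or the directory is gone).
    """
    target = os.path.join(mount_point, install_path)
    found = []
    # os.walk yields nothing if target does not exist, so no special case is needed.
    for root, _dirs, files in os.walk(target):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), mount_point))
    return sorted(found)
```

For example, with the export mounted at /mnt/secondary (an assumed location), leftover_files("/mnt/secondary", "snapshots/2/607") returns the files to review by hand.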
Problem Sources
The problem can occur during snapshot creation in several cases; the most common are:
- network connectivity failures;
- management server failure;
- hardware performance degradation or failure.
Long-term Solution
As a long-term solution, an automated script can be deployed that detects the situation and notifies ITSM about it (or even cleans it up automatically).
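The core of such a script might look like the sketch below. It is only an outline under stated assumptions: fetch_stuck is a hypothetical callable that runs a query like the one used in this article and returns rows as dictionaries, and notify is a hypothetical callable that delivers a text message to the ITSM system (by e-mail, webhook or similar). Automatic cleanup is deliberately left out.

```python
def check_and_notify(fetch_stuck, notify):
    """Run one check: fetch stuck snapshot rows and send a single message
    if there are any. Returns the number of stuck snapshots found."""
    rows = list(fetch_stuck())
    if rows:
        lines = [f"{len(rows)} snapshot(s) stuck in a transient state:"]
        for row in rows:
            lines.append(
                f"id={row['id']} status={row['status']} name={row['name']}"
            )
        notify("\n".join(lines))  # one message per run, not one per snapshot
    return len(rows)
```

Run from cron or a similar scheduler, this sends nothing when the list is empty, so the ITSM queue only receives a ticket when there is something to fix.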