About VSS Errors:
VSS (Volume Shadow Copy Service) is a core component of Windows that permits backup applications like Unitrends to safely protect a Windows system while it is running. If the VSS service cannot operate properly, it produces specific errors that are visible to and collected by Unitrends, logs events in the Windows Event Viewer, and causes the backup to fail.
Some VSS failures are transient and trivially solved. Others are related to improper deployment or configuration of the Windows OS or the applications within it. Still others indicate that catastrophic damage to Windows has occurred, often requiring rollback to a prior known good state, disaster recovery, or a complete server rebuild. Though a restart of services or of Windows itself will often temporarily solve a VSS issue, understand that if the underlying condition or event that caused VSS to enter a failed state recurs, the VSS error will also recur.
If VSS is non-functional, Unitrends will not be able to take safe backups of the server until the issue is resolved. Unitrends does not cause VSS errors (though attempting a snapshot is often what surfaces the error). Unitrends uses no custom code to manage VSS; it is a native core component of Microsoft's OS. When VSS has failed, it is because Windows itself cannot internally perform the requested operation. No code or settings change to Unitrends or our agent can by itself resolve these issues; they typically require repairing the OS itself or addressing infrastructure concerns.
Common VSS Errors
The list of errors below is not all-inclusive, and neither are the suggested common root causes or resolutions.
Often the best way to troubleshoot VSS errors is to locate the "0x8" or HRESULT code, along with related Windows EVT event IDs, and search for them online. Common VSS error codes are listed below. Unitrends software may or may not encounter all of these codes. The information below is provided as general guidance.
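When searching for a code, it can help to know what the hex value encodes. The following is a minimal Python sketch, assuming the standard Windows HRESULT layout (bit 31 is the severity flag, bits 16-26 the facility, the low 16 bits the code). The function and type names are illustrative only and are not part of any Unitrends or Microsoft tool.

```python
from typing import NamedTuple


class HResult(NamedTuple):
    failed: bool   # severity bit (bit 31): True means the call failed
    facility: int  # bits 16-26: which subsystem defined the code
    code: int      # low 16 bits: the specific status


def decode_hresult(text: str) -> HResult:
    """Split a hex HRESULT string such as '0x80042301' into its fields."""
    value = int(text, 16) & 0xFFFFFFFF
    return HResult(
        failed=bool(value >> 31),
        facility=(value >> 16) & 0x7FF,
        code=value & 0xFFFF,
    )


if __name__ == "__main__":
    # Codes taken from the list in this article.
    for sample in ("0x80042301", "0x00042309", "0x800703FA"):
        print(sample, decode_hresult(sample))
```

This shows why failures in the list start with "0x8" while the VSS_S_ status codes start with "0x0": only the former have the severity bit set.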
Unitrends Support staff are limited in the amount of Windows troubleshooting they can perform. When errors within Windows itself are found, Support can provide guidance or recommendations, including researching native Microsoft errors to help determine potential root causes, but will typically recommend engaging Microsoft support if the issue is not easily resolved. Should Unitrends support make such a recommendation, we will provide access to a senior engineer who will participate in 3rd party calls to Microsoft initiated by the customer to help accelerate resolution.
VSS_E_BAD_STATE (0x80042301): A function call was made when the object was in an incorrect state. This indicates that Microsoft's VSS framework and/or some of the VSS writers are in a bad state. A reboot may resolve the error temporarily. It may also be caused by raw volumes present in the server in an offline state. May be prevalent on CSV storage that is not deployed to Microsoft standards.
VSS_E_UNEXPECTED (0x80042302): A volume shadow copy service (VSS) component encountered an unexpected error. This error is typically associated with filesystem damage. Customers should run chkdsk /F on impacted volumes, followed by SFC and DISM scans.
VSS_E_PROVIDER_NOT_REGISTERED (0x80042304): The volume shadow copy provider is not registered in the system. This issue can only be resolved by registry repair. The Shadow Copies tab in disk properties will also fail to load.
VSS_E_PROVIDER_VETO (0x80042306): The shadow copy provider had an error. Windows explicitly rejected the request to perform a shadow copy at this time. This can be a transient problem, but it is also important to ensure that no other backup providers are installed, that the system has sufficient snapshot space located in the volume being backed up, and that multiple backup types are not in use for the server.
VSS_E_PROVIDER_IN_USE (0x80042307): The shadow copy provider is currently in use and cannot be unregistered. Commonly seen in VM backups where the host cannot quiesce the guest. Verify that no 3rd party backup software is installed and that no other backups or Windows updates are running during VM backups. Review the Windows EVT logs inside the guest for further troubleshooting.
VSS_E_OBJECT_NOT_FOUND (0x80042308): The specified object was not found. This error means that Microsoft VSS failed to take a snapshot of your file systems and that the backup job will be unable to back up any files that are opened exclusively by other applications. The most common cause of this error is that VSS has been disabled on one or more of the volumes that are part of the backup, or that there is an issue with VSS DLL registration.
VSS_S_ASYNC_PENDING (0x00042309): The asynchronous operation is pending. This can be common on database servers where other critical operations are happening at the same time as the backup.
VSS_S_ASYNC_FINISHED (0x0004230A): The asynchronous operation has completed.
VSS_S_ASYNC_CANCELLED (0x0004230B): The asynchronous operation has been cancelled.
VSS_E_VOLUME_NOT_SUPPORTED (0x8004230C): Shadow copying the specified volume is not supported. This may indicate shadows have been disabled on a volume that should support them. Check shadow settings for the volumes in the server.
VSS_E_OBJECT_ALREADY_EXISTS (0x8004230D): The object already exists. This usually indicates the presence of non-Microsoft shadow providers. Removal of 3rd party backup software products other than Unitrends will typically resolve this issue.
VSS_E_VOLUME_NOT_SUPPORTED_BY_PROVIDER (0x8004230E): The given shadow copy provider does not support shadow copying the specified volume. This usually indicates the presence of non-Microsoft shadow providers. Removal of 3rd party backup software products other than Unitrends will typically resolve this issue.
VSS_E_UNEXPECTED_PROVIDER_ERROR (0x8004230F): The shadow copy provider had an unexpected error while trying to process the specified operation. This usually indicates the presence of non-Microsoft shadow providers. Removal of 3rd party backup software products other than Unitrends will typically resolve this issue.
VSS_E_CORRUPT_XML_DOCUMENT (0x80042310): The given XML document is invalid. It is either incorrectly-formed XML or it does not match the schema. This is commonly associated with running or failed Windows updates. Ensure updates and any other driver or application installations have completed successfully and Windows is in a clean reboot state.
VSS_E_INVALID_XML_DOCUMENT (0x80042311): The given XML document is invalid. It is either incorrectly-formed XML or it does not match the schema. This is commonly associated with running or failed Windows updates. Ensure updates and any other driver or application installations have completed successfully and Windows is in a clean reboot state.
VSS_E_MAXIMUM_NUMBER_OF_VOLUMES_REACHED (0x80042312): The maximum number of volumes for this operation has been reached. This condition is potentially unsafe as it means older shadow copies are not being properly removed. Possible causes include antivirus scanning activity interrupting operations, use of multiple backup products, incomplete Windows updates, or backups that are too frequent for a system with limited resources. Safely removing stale snapshots is required, but the reason they hang should also be investigated.
VSS_E_FLUSH_WRITES_TIMEOUT (0x80042313): The shadow copy provider timed out while flushing data to the volume being shadow copied. This is probably due to excessive activity on the volume. Try again later when the volume is not being used so heavily. May be more common on systems with limited resources or slow disks.
VSS_E_HOLD_WRITES_TIMEOUT (0x80042314): The shadow copy provider timed out while holding writes to the volume being shadow copied. This is probably due to excessive activity on the volume by an application or a system service. Try again later when activity on the volume is reduced. May be more common on systems with limited resources or slow disks.
VSS_E_UNEXPECTED_WRITER_ERROR (0x80042315): VSS encountered problems while sending events to writers. May occur if other backup products are installed, if multiple shadows are trying to create or consolidate concurrently, or if backups run too frequently without giving Windows sufficient time to clean up after a prior job. Can also be related to limited server resources.
VSS_E_SNAPSHOT_SET_IN_PROGRESS (0x80042316): Another shadow copy creation is already in progress. Wait a few moments and try again. Ensure no other backup products are installed, that multiple types of backups are not being attempted for the same server, and that backups are not too frequent for available system resources.
VSS_E_MAXIMUM_NUMBER_OF_SNAPSHOTS_REACHED (0x80042317): The specified volume has already reached its maximum number of shadow copies. The volume has been added to the maximum number of shadow copy sets. The specified volume was not added to the shadow copy set. Other possible reasons: there is not enough free disk space on the drive where the locked file is located, or other software is already using the shadow volume for the drive. Restart your computer and try again.
VSS_E_WRITER_INFRASTRUCTURE (0x80042318): An error was detected in the VSS. The problem occurred while trying to contact VSS writers. This typically requires ensuring the default Microsoft system writer is forced. Commonly seen if other backup providers are installed.
VSS_E_WRITER_NOT_RESPONDING (0x80042319): A writer did not respond to a GatherWriterStatus call. The writer might have terminated, or it might be stuck. Commonly occurs if there is disk corruption requiring chkdsk /f to resolve or when free space for shadows is limited.
VSS_E_WRITER_ALREADY_SUBSCRIBED (0x8004231A): The writer has already successfully called the Subscribe function. It cannot call Subscribe multiple times. There could be backups scheduled too frequently for this system or multiple backup types being tried concurrently.
VSS_E_UNSUPPORTED_CONTEXT (0x8004231B): The shadow copy provider does not support the specified shadow copy type. Shadow space on the volume may be too small, or could be redirected to an alternate volume that is not part of the backup scope or does not support VSS operations.
VSS_E_VOLUME_IN_USE (0x8004231D): The specified shadow copy storage association is in use and so can't be deleted. This may occur if shadows were in use when the volume was added to a cluster, if the volume failed over with a shadow in use, or if another system is sharing the disk and has its own snapshot in use. Manual snapshot removal is often necessary to resolve.
VSS_E_MAXIMUM_DIFFAREA_ASSOCIATIONS_REACHED (0x8004231E): Maximum number of shadow copy storage associations already reached. Ensure multiple backups are not in use and 3rd party software is uninstalled. Remove existing snapshots and assess snapshot storage space.
VSS_E_INSUFFICIENT_STORAGE (0x8004231F): Insufficient storage available to create either the shadow copy storage file or other shadow copy data. Reassess snapshot storage space use and delete stale snapshots. This is common when volumes have less than the Microsoft recommended 10% free space.
VSS_E_REBOOT_REQUIRED (0x80042327): A reboot is required after completing this operation. Windows Update or MSIExec has a snapshot in use. Updates/installations must be finished and the server rebooted properly before snapshots can be used.
VSS_E_VOLUME_NOT_LOCAL (0x8004232D): The volume being backed up is not mounted on the local host. This can occur when backing up a VM that is part of a cluster, or when a cluster volume included in a backup fails over to another host after the backup is initiated. May also occur in some cases with iSCSI or FibreChannel connected mounts where the storage system uses 3rd party VSS providers. See best practices for protecting cluster resources.
VSS_E_CLUSTER_TIMEOUT (0x8004232E): A timeout occurred while preparing a cluster shared volume for backup. Commonly occurs when cluster configuration does not meet Microsoft requirements or best practices. Ensure you are using the correct type of backup for your cluster and are protecting the cluster through the active node. Commonly seen on Hyper-V clusters where guests from different hosts are in the same CSV, which must be corrected.
VSS_E_WRITERERROR_INCONSISTENTSNAPSHOT (0x800423F0): The shadow copy set contains only a subset of the volumes needed to correctly back up the selected components of the writer. This can occur when attempting backups of certain SharePoint servers in a farm configuration, especially after updates have occurred. Note that system state backups of farm servers are not viable for recovery, and VM and image backups of SharePoint are not supported. For non-SharePoint servers, this may also indicate file system corruption.
VSS_E_WRITERERROR_OUTOFRESOURCES (0x800423F1): A resource allocation failed while processing this operation. Windows ran out of memory. Reassess server requirements and ensure applications are not causing memory leaks or hangs. A reboot is usually only a short-term resolution for this issue.
VSS_E_WRITERERROR_TIMEOUT (0x800423F2): The writer's timeout expired between the Freeze and Thaw events. This particular error is typically not seen on patched servers but was an unpatched issue years ago. It can occur if the SAM registry writer has pending tasks during snapshot consolidation at the end of a backup.
VSS_E_WRITERERROR_RETRYABLE (0x800423F3): The writer experienced a transient error. If the backup process is retried, the error might not reoccur. Can occur for many reasons. Common reasons are non-responsive USB volumes in Windows, a very large number of volumes to snapshot, many SQL databases present in a single volume, too few Windows resources (especially low I/O performance), concurrent snapshot operations from other backups, too frequent backups, backups and Windows updates running concurrently, or cluster systems not deployed to Microsoft best practices. Server architectural changes are commonly required to permanently resolve this issue. A reboot may temporarily resolve it. Unitrends sees this most commonly on Hyper-V host clusters that do not follow Microsoft deployment guidelines. Run Best Practice Analyzer with the storage option (requires VMs to be off) and review results on Hyper-V clusters to resolve.
VSS_E_WRITERERROR_NONRETRYABLE (0x800423F4): The writer experienced a non-transient error. If the backup process is retried, the error is likely to reoccur. Commonly caused by SQL VSS or Hyper-V VSS issues and requires a reboot to resolve. Often due to incomplete Windows updates or multiple backup types conflicting for the server. May also be due to insufficient server resources. Can also be seen where Azure AD sync is failing, or in cluster configurations when nodes are not in a healthy sync state.
VSS_E_WRITERERROR_RECOVERY_FAILED (0x800423F5): The writer experienced an error while trying to recover the shadow copy volume. Can be caused by Windows resource overuse or conflicting backup providers. Ensure no other backup software is installed and multiple backups of the same machine are not in use.
VSS_E_LEGACY_PROVIDER (0x800423F7): This version of the hardware provider does not support this operation. SAN storage vendors often add hardware providers for VSS. Ensure you are not using SAN Snapshot technology in addition to backups and that your SAN drivers are current.
VSS_E_MISSING_DISK (0x800423F8): An expected disk did not arrive in the system. Ensure that any disk present when the backup was initiated remains connected through the entire operation. Some IPKVM systems provide passthru storage to Windows which may appear as a local disk instead of a removable disk. This can also occur when a cluster has not cleanly failed over, and may require failover and failback or a complete cluster restart to resume normal operation.
VSS_E_MISSING_HIDDEN_VOLUME (0x800423F9): An expected hidden volume did not arrive in the system. Check the Application event log for more information. This is a bug encountered in Windows 2019 and newer systems, related to a defect Microsoft has patched. Typically seen when performing VM backups of Win2019 guests. Update Windows to resolve. A similar defect was present in Hyper-V 2012 and 2016 clusters.
VSS_E_MISSING_VOLUME (0x800423FA): An expected volume did not arrive in the system. Check the Application event log for more information. May be seen with clusters in an unhealthy state. Can also be seen with offline disks or drives that use low power states that fail to respond within Windows.
VSS_E_DYNAMIC_DISK_ERROR (0x800423FC): An error occurred in processing the dynamic disks involved in the operation. Dynamic disks are deprecated by Microsoft and should not be used, except for OS disk mirroring, in any currently supported version of Windows. Note that use of dynamic disk boot mirrors prevents server recovery using BareMetal, Windows replicas, and instant recovery. By best practice, any server with dynamic disks should be rebuilt to not have that configuration.
VSS_E_CLUSTER_ERROR (0x80042400): The clustered disks could not be enumerated or could not be put into cluster maintenance mode. Check the System event log for cluster related events and the Application event log for VSS related events.
VSS_E_UNSELECTED_VOLUME (0x8004232A): The requested operation would overwrite a volume that is not explicitly selected. For more information, check the Application event log. May persist in environments where volume synchronization, clustering, or ADFS is not functioning properly.
VSS_E_SNAPSHOT_NOT_IN_SET (0x8004232B): The shadow copy ID was not found in the backup components document for the shadow copy set. This indicates a functional failure in Windows to manage its own shadow copies. Removing stale copies may be required.
VSS_E_NESTED_VOLUME_LIMIT (0x8004232C): The specified volume is nested too deeply to participate in the VSS operation. Occurs on a Hyper-V host-level backup where a VM's internal OS, through Disk Management, has directly mounted a VHD/VHDX file that is part of the storage the host OS is snapshotting. Instead, use Hyper-V to connect the VHD to the VM from the host level.
VSS_E_NOT_SUPPORTED (0x8004232F): The requested operation is not supported. Indicates the disk, or critical objects on the disk, are excluded from VSS via the registry but were included in a backup. May be seen on 3rd party database servers, and usually indicates that direct backup of the application cannot be supported and will require 3rd party tools to protect.
VSS_E_WRITERERROR_PARTIAL_FAILURE (0x80042336): The writer experienced a partial failure. Check the component level error state for more information. Produced by the Hyper-V VSS writer. Typically indicates a VM prevented the snapshot, commonly when migrating between storage owned by a different VM host or where CSV cluster architecture best practices and VM separation have not been followed.
VSS_E_WRITER_STATUS_NOT_AVAILABLE (0x80042409): Writer status is not available for one or more writers. A writer might have reached the limit to the number of available backup-restore session states. Ensure too-frequent backups or multiple backup types are not being used. Common on SQL and Hyper-V servers with multiple parallel backups, as well as in clusters. Care may need to be taken to limit the number of concurrent backups against a cluster.
VSS_E_KEY_DELETED (0x800703FA): Illegal operation attempted on a registry key that has been marked for deletion.
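As an illustration of using the list above, the following Python sketch scans a block of log text for 8-digit hex codes and maps them to the names in this article. The table holds only a small subset of the codes, and the function name and sample log line are hypothetical. It is a starting point for triage, not a Unitrends tool.

```python
import re

# Subset of the codes listed in this article; extend as needed.
VSS_CODES = {
    0x80042301: "VSS_E_BAD_STATE",
    0x80042306: "VSS_E_PROVIDER_VETO",
    0x80042308: "VSS_E_OBJECT_NOT_FOUND",
    0x8004231F: "VSS_E_INSUFFICIENT_STORAGE",
    0x800423F3: "VSS_E_WRITERERROR_RETRYABLE",
    0x800423F4: "VSS_E_WRITERERROR_NONRETRYABLE",
}

# Matches "0x" followed by exactly eight hex digits.
HEX_CODE = re.compile(r"0x[0-9A-Fa-f]{8}\b")


def find_vss_codes(log_text: str) -> list[tuple[str, str]]:
    """Return (code as written, symbolic name) for each hex code in the text."""
    found = []
    for match in HEX_CODE.finditer(log_text):
        token = match.group(0)
        name = VSS_CODES.get(int(token, 16), "unknown (search online)")
        found.append((token, name))
    return found


if __name__ == "__main__":
    # Hypothetical log line for demonstration only.
    print(find_vss_codes("Snapshot failed with 0x800423f4 on volume C:"))
```

Codes that are not in the table are still reported, so they can be searched online together with the related event IDs.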
RESOLUTIONS AND TROUBLESHOOTING
Sometimes simple steps like the following will help resolve issues with writers:
1. Ensure no backups or restores are running (no wbps.exe processes should be active in Windows).
2. If performing a VM backup, ensure that no VM snapshots are present, that the VM is not currently migrating to alternate storage, and that VM and in-guest backups are not both scheduled.
3. Stop the impacted VSS writer service in Windows, for example the VSS SQL Writer.
4. Stop the core VSS service itself.
5. Wait 30 seconds.
6. Start the VSS service.
7. Start the impacted VSS writer service.
8. Run the command "vssadmin list writers".
9. Check whether any errors are reported for the writer.
10. Reattempt the backup operation.
11. If errors occur, check the System and Application event logs for error and warning events from VSS and the impacted VSS writer, and attempt to remediate.
12. If errors persist, reboot the OS and retry steps 7-10.
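The stop, wait, and start sequence above can be sketched as a short Python script. The service names are assumptions: "VSS" is the usual service name for Volume Shadow Copy and "SQLWriter" for the SQL Server VSS Writer, but confirm the names on the affected server before use. By default the script only prints the commands. It runs them only on Windows, from an elevated prompt, when started with the hypothetical --run argument.

```python
import subprocess
import sys
import time


def restart_plan(writer_service: str) -> list[list[str]]:
    """Build the command sequence: stop writer, stop VSS, start VSS, start writer, list writers."""
    return [
        ["net", "stop", writer_service],
        ["net", "stop", "VSS"],  # assumed service name for Volume Shadow Copy
        ["net", "start", "VSS"],
        ["net", "start", writer_service],
        ["vssadmin", "list", "writers"],
    ]


def run_plan(plan: list[list[str]], wait_seconds: int = 30) -> None:
    """Run each command in order, pausing before VSS is started again."""
    for command in plan:
        if command == ["net", "start", "VSS"]:
            time.sleep(wait_seconds)
        # check=False: "net stop" returns an error if the service is already stopped.
        subprocess.run(command, check=False)


if __name__ == "__main__":
    plan = restart_plan("SQLWriter")  # assumed name of the SQL Server VSS Writer service
    for command in plan:
        print(" ".join(command))
    if "--run" in sys.argv and sys.platform == "win32":
        run_plan(plan)
```

Confirm that no backups or restores are active before running the sequence, and review the writer list it prints before reattempting the backup.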
For additional troubleshooting steps, see also:
Windows Hot Fixes
Unitrends assumes all currently available Windows patches and relevant hotfixes are applied to servers. This includes updates for Microsoft applications like SQL and Exchange, as well as manufacturer drivers. Additionally, after Windows updates run, it is often critical to ensure a reboot occurs if any fixes require one. Because Windows updates are commonly applied without rebooting, this is the largest reason why a simple reboot tends to resolve most VSS errors, but it is often symptomatic of failing to properly manage Windows updates and reboots.
Controlling the VSS diff area size:
If after applying these fixes, one of the following events occurs:
- "The shadow copy of volume C: took too long to install"
- "The shadow copy of volume C: was aborted because the diff area file could not grow in time."
Consider reducing the I/O load on this system to avoid this problem in the future. If these events still occur, then the following registry key can be used to control the size of the diff area used by VSS:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\VolSnap\MinDiffAreaFileSize : REG_DWORD :
(the value is in MB; the default size is 300, and you can increase it up to 3000)
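As a hedged sketch, the registry value above could be set with the built-in reg.exe utility. The Python helper below only builds the command line and checks the value against the 300 to 3000 range given in this article. The helper name is hypothetical, and the unit (MB) is assumed from Microsoft's description of this value. Run the resulting command from an elevated prompt, and only after reviewing it.

```python
def min_diff_area_command(size_mb: int) -> str:
    """Return a reg.exe command that sets MinDiffAreaFileSize to size_mb."""
    # Range taken from this article: default 300, maximum 3000.
    if not 300 <= size_mb <= 3000:
        raise ValueError("size must be between 300 and 3000")
    return (
        'reg add "HKLM\\SYSTEM\\CurrentControlSet\\Services\\VolSnap" '
        f"/v MinDiffAreaFileSize /t REG_DWORD /d {size_mb} /f"
    )


if __name__ == "__main__":
    print(min_diff_area_command(3000))
```

Printing the command instead of executing it keeps the change a deliberate, reviewable step.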
Recommended event log maximum size: Microsoft indicates that if the event logs are very large, the copy operation can take longer than the timeout on systems with high I/O load or high memory load. Microsoft recommends that event logs be kept below 64 MB in size.
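One way to check the configured maximum is to read the output of "wevtutil gl Application" (or "wevtutil gl System"). The Python sketch below assumes that output contains a "maxSize:" line giving the limit in bytes, as it does on the systems we are aware of. The sample text is abridged and illustrative, and the function names are hypothetical.

```python
import re

LIMIT_BYTES = 64 * 1024 * 1024  # the 64 MB recommendation from this article

# Abridged, illustrative output of: wevtutil gl Application
SAMPLE = """name: Application
enabled: true
logging:
  retention: false
  autoBackup: false
  maxSize: 20971520
"""


def max_size_bytes(wevtutil_output: str):
    """Return the configured maxSize in bytes, or None if no such line is present."""
    match = re.search(r"^\s*maxSize:\s*(\d+)\s*$", wevtutil_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def exceeds_limit(wevtutil_output: str, limit: int = LIMIT_BYTES) -> bool:
    """True if the configured maximum log size is larger than the limit."""
    size = max_size_bytes(wevtutil_output)
    return size is not None and size > limit


if __name__ == "__main__":
    print(max_size_bytes(SAMPLE), exceeds_limit(SAMPLE))
```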
Diagnostic Information for Microsoft Assistance
When it is determined that the VSS failure is outside the scope of the Unitrends Windows client, the following information should be gathered for Microsoft support to examine:
- Windows application event log
- Windows system event log
- VSS trace (see instructions below)
Unitrends senior engineers will participate in calls customers make to Microsoft on customer request. We cannot initiate these calls, but will not leave customers to deal with Microsoft support on their own.
Examine the application and system event logs focusing on the error events created by the VolSnap and VSS sources at the time of failure. It is helpful to extract the germane events from the log to isolate the problem and have a more productive interaction with MS support.
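To extract the relevant events, one option is the built-in wevtutil utility with an XPath filter on the event source. The Python sketch below only builds the command lines. The provider names "VSS" (Application log) and "volsnap" (System log) are assumptions based on how these sources usually appear, so check the source name shown in Event Viewer on the affected server. The function name is hypothetical.

```python
def event_query(log_name: str, provider: str, count: int = 50) -> list[str]:
    """Build a wevtutil command listing recent critical, error, and warning events."""
    xpath = (
        f"*[System[Provider[@Name='{provider}'] "
        "and (Level=1 or Level=2 or Level=3)]]"
    )
    return [
        "wevtutil", "qe", log_name,
        f"/q:{xpath}",
        "/f:text",     # human-readable output
        "/rd:true",    # newest events first
        f"/c:{count}",  # limit the number of events returned
    ]


QUERIES = [
    event_query("Application", "VSS"),
    event_query("System", "volsnap"),
]

if __name__ == "__main__":
    for query in QUERIES:
        print(" ".join(query))
```

Redirecting the output of each command to a text file gives a compact extract to share with Microsoft support.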
How to perform a VSS trace:
vsstrace is a tool provided by Microsoft as part of the Windows software development kit (SDK). It is an advanced tool for VSS data collection, used with Microsoft senior support when investigating Windows defects or damage. This tool and its data are not used by Unitrends, but we recommend collecting this data for any customer who is advised to engage Microsoft support to determine the nature of their VSS errors.
The SDK is available for most versions of Windows here and must be installed before vsstrace can be used: https://developer.microsoft.com/en-us/windows/downloads/windows-sdk/
Instructions for using vsstrace are here: https://learn.microsoft.com/en-us/windows/win32/vss/using-tracing-tools-with-vss
I/O performance diagnosis:
Use Microsoft I/O performance tools to gather data for analysis. Check disk defragmentation.
Checking health of writers:
The Unitrends Windows agent uses the VSS interface to read SQL data from the disk. If a problem is suspected with the VSS writer responsible for SQL, use the Microsoft utility vssadmin.exe to confirm that the SQL writer is available on the system.
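The writer check with vssadmin can be partly automated. The Python sketch below parses text in the shape that "vssadmin list writers" normally prints (a "Writer name:" line followed by "State:" and "Last error:" lines) and reports writers that are not stable. The sample output is abridged and illustrative, and the function names are hypothetical.

```python
import re

# Abridged, illustrative output of: vssadmin list writers
SAMPLE = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   State: [8] Failed
   Last error: Non-retryable error
"""

WRITER = re.compile(
    r"Writer name: '(?P<name>[^']+)'"
    r".*?State: \[\d+\] (?P<state>[^\r\n]+)"
    r".*?Last error: (?P<error>[^\r\n]+)",
    re.DOTALL,
)


def parse_writers(output: str) -> list[dict]:
    """Return one dict per writer with its name, state, and last error."""
    return [
        {
            "name": m["name"],
            "state": m["state"].strip(),
            "last_error": m["error"].strip(),
        }
        for m in WRITER.finditer(output)
    ]


def unhealthy(writers: list[dict]) -> list[dict]:
    """Writers that are not Stable or that report an error."""
    return [
        w for w in writers
        if w["state"] != "Stable" or w["last_error"] != "No error"
    ]


if __name__ == "__main__":
    writers = parse_writers(SAMPLE)
    print("SQL writer present:", any(w["name"] == "SqlServerWriter" for w in writers))
    print("Unhealthy:", unhealthy(writers))
```

On a real server, the text would come from running vssadmin in an elevated prompt. A writer that is missing from the list entirely is also a finding worth noting.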
For more information see this Microsoft KB "Registry Keys and Values for Backup and Restore".