Storage Vault Locks

Lock files are an important part of Magnus Box's safety design: they keep data consistent during concurrent operations.

Problem statement 

Magnus Box supports multiple devices backing up into a shared Storage Vault simultaneously.

A retention pass first looks at which data chunks exist in the Vault, then deletes the unused ones.

A backup job first looks at which data chunks exist in the Vault, then uploads new chunks from the local data, and finally uploads a backup snapshot that relies on both pre-existing and newly uploaded chunks.

It is perfectly safe for multiple backup jobs to run simultaneously, even from multiple devices.

In Magnus Box 21.9.4 and later, it is also perfectly safe to run backup and retention jobs simultaneously. A mark-and-sweep algorithm is used to safely delete data chunks while other backup operations are proceeding concurrently.
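As a rough illustration of the mark-and-sweep idea only, the sketch below marks every chunk referenced by a snapshot and then sweeps the rest. The function name, the chunk IDs and the in-memory sets are assumptions made for this example. It does not show how Magnus Box coordinates the sweep with backup jobs that are running at the same time.

```python
def mark_and_sweep(all_chunks, snapshots):
    """Return (kept, deleted) sets of chunk IDs.

    all_chunks: set of chunk IDs currently stored (hypothetical IDs).
    snapshots:  dict mapping snapshot ID -> set of chunk IDs it references.
    """
    # Mark phase: every chunk referenced by any snapshot is still in use.
    marked = set()
    for chunk_ids in snapshots.values():
        marked |= chunk_ids

    # Sweep phase: chunks that were never marked are unused.
    deleted = all_chunks - marked
    kept = all_chunks & marked
    return kept, deleted


vault = {"c1", "c2", "c3", "c4"}
snaps = {"snap-a": {"c1", "c2"}, "snap-b": {"c2", "c3"}}
kept, deleted = mark_and_sweep(vault, snaps)
print(sorted(kept), sorted(deleted))
```

In this toy example only "c4" is swept, because no snapshot refers to it.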

Lock files 

In order to check whether a retention pass is currently running, Magnus Box must communicate between all devices that could potentially be using the Storage Vault.

To determine whether any other device is actively using a Storage Vault, Magnus Box writes a temporary text file into the Storage Vault and deletes it when the job is completed. Any other job can then look for these files to see what other operations are taking place concurrently. This is the only mechanism supported across all Storage Vault types (e.g. local disk, SFTP, S3).
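The general pattern can be sketched as follows, using a local directory in place of a Storage Vault. Everything specific here is an assumption for illustration: the "locks" subdirectory, the JSON fields, and the names vault_lock and other_locks are hypothetical and do not describe Magnus Box's real file layout or format.

```python
import contextlib
import json
import os
import socket
import tempfile
import time
import uuid
from pathlib import Path


@contextlib.contextmanager
def vault_lock(vault_dir, job_type):
    """Write a lock file for the duration of a job, then remove it."""
    lock_dir = Path(vault_dir) / "locks"  # hypothetical layout
    lock_dir.mkdir(parents=True, exist_ok=True)
    lock_path = lock_dir / f"{uuid.uuid4().hex}.lock"
    lock_path.write_text(json.dumps({
        "job": job_type,
        "host": socket.gethostname(),
        "pid": os.getpid(),
        "since": time.time(),
    }))
    try:
        yield lock_path
    finally:
        # Never reached if the process is killed or the machine crashes,
        # which is how a stale lock file gets left behind.
        lock_path.unlink(missing_ok=True)


def other_locks(vault_dir, own=None):
    """Return the lock records written by other jobs."""
    lock_dir = Path(vault_dir) / "locks"
    return [json.loads(p.read_text())
            for p in sorted(lock_dir.glob("*.lock")) if p != own]


with tempfile.TemporaryDirectory() as vault:
    with vault_lock(vault, "backup") as mine:
        with vault_lock(vault, "retention"):
            seen = [rec["job"] for rec in other_locks(vault, own=mine)]
    remaining = other_locks(vault)
print(seen, remaining)
```

While both jobs are active, the backup job sees the retention job's lock. Once both have finished normally, no lock files remain.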

Downsides of lock file design 

If Magnus Box is stopped suddenly (e.g. by a PC crash), the lock file is not removed, and other Magnus Box processes cannot tell that the job has stopped. This can prevent backup jobs and/or retention passes from functioning properly.

Magnus Box will alert you to this issue by failing the job. The error message should explain which device and/or job was responsible for originating the now-stale lock file.

You may see error messages of the form:

  • Locked by user '...' on this device (PID #...) since ... (... days ago)
  • Locked by user '...' on computer '...' (PID #...) since ... (... days ago)
  • However, the responsible process might have stopped.
  • If you investigate this process, and are absolutely certain it won't resume, then it's safe to ignore it and continue.

It is possible to delete lock files to recover from this situation. However, you MUST investigate the issue to ensure that the responsible process really has stopped. Consider that a PC may go to sleep at any time, and wake up days or weeks later, and immediately resume from the middle of a backup or retention operation; if the lock files were removed incorrectly, data loss is highly likely.

If you are sure that the responsible process is stopped, you can delete the lock files.

You can initiate this either

Backup strategies to avoid lock file issues 

  • If you are experiencing lock file errors where the duration is 5 minutes or less, consider rescheduling the backups, moving a backup ahead or behind by 30 minutes.
  • If you are experiencing lock file errors where the duration is more than 5 minutes (anywhere from several hours to multiple days of the Storage Vault being locked), consider moving one or more of the devices to a separate Storage Vault where the backups will not be affected by the other devices.

Automatic unlock

Magnus Box will automatically delete stale lock files when it determines that it is safe to do so.

  • When Magnus Box is running on the same PC as a potentially-stale lock file, it can check the running processes to see if the originator process is still running.
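A same-machine liveness check of this kind might look like the sketch below. It is an assumption about the general technique, not Magnus Box's code. It works on POSIX systems only (on Windows, os.kill behaves differently and must not be used this way), and because operating systems reuse PIDs, a "running" result does not prove that the process is the original job.

```python
import os


def process_is_running(pid):
    """Best-effort check that a process with this PID exists (POSIX only)."""
    if os.name != "posix":
        raise NotImplementedError("this sketch relies on POSIX signal 0")
    try:
        os.kill(pid, 0)  # signal 0 performs error checking without sending a signal
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True


print(process_is_running(os.getpid()))
```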

Recovering from unsafe unlock operations

If you encounter the error "Packindex '...' for snapshot '...' refers to unknown pack '...', shouldn't happen", a data file has been erroneously deleted inside the Storage Vault and data has been lost. This can happen if the "Unlock" feature is used without proper caution as advised above.

Please contact Magnus Box support for assistance.

Error "Found X packs in index but not appearing on disk. Reindex needed!"

The message indicates that the index files refer to X packs of data chunks that cannot currently be found inside the Storage Vault.

Encrypted data chunks in the Storage Vaults are carefully curated and indexed. It is still possible for the index of data chunks to become mismatched with the actual chunks stored. This can happen:

  • if a file was manually deleted from the Storage Vault, or
  • if your storage platform experienced data loss for any reason (e.g. RAID failure), or
  • if the Unlock action was used in an unsafe way, or
  • if a retention pass went back in time (e.g. via VM snapshot resumption)
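Conceptually, this kind of mismatch is a set difference between the packs the index refers to and the packs actually present in storage. The sketch below uses made-up pack names and a hypothetical helper, packs_missing_from_disk, and is not how Magnus Box itself performs the check.

```python
def packs_missing_from_disk(indexed_packs, packs_on_disk):
    """Return the packs that the index refers to but storage does not hold."""
    return sorted(set(indexed_packs) - set(packs_on_disk))


indexed = ["data/aa", "data/bb", "data/cc"]  # hypothetical pack names
on_disk = ["data/aa", "data/cc", "data/dd"]
missing = packs_missing_from_disk(indexed, on_disk)
print(f"{len(missing)} pack(s) in index but not on disk: {missing}")
```

Here "data/bb" is reported as missing. "data/dd" is on disk but not indexed, which is the opposite situation and is not counted.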

Solution

  • Toggle on Admin -> Advanced Options
  • Accounts -> Users -> name -> Devices tab -> click 'Online' to open dialogue -> Storage Vault tab -> Reindex
  • When done, Toggle off Admin -> Advanced Options
  • Accounts -> Users -> name -> Devices tab -> click 'Online' to open dialogue -> Storage Vault tab -> Apply Retention Rules Now
  • When done, Accounts -> Users -> name -> Actions -> View Job History -> 'Report' to view last job
  • An absence of errors indicates the optimization pass found the missing data chunk(s) somewhere in the Storage Vault, and created a new index to reference them. The issue is solved.
  • If errors are reported, check the documentation for solutions to the specific error message.

Error "<packindex/pppppppp> says <snapshot/ssssssss> depends on missing pack <data/dddddddd>, non-restorable!"

The message is similar to the 'Reindex needed' issue. It indicates that a specific backup snapshot, 'ssssssss', refers to data chunks, 'dddddddd', that cannot currently be found inside the Storage Vault. The specified backup snapshot is not currently restorable.

The error message may show more than one snapshot.
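To illustrate why several snapshots can be affected at once, the sketch below maps each snapshot to the packs it needs and reports the ones that cannot be found. The snapshot and pack names, the dictionary structure, and the function name are assumptions made for this example only.

```python
def non_restorable_snapshots(snapshot_packs, packs_on_disk):
    """Map each affected snapshot to the packs it needs but cannot find."""
    on_disk = set(packs_on_disk)
    report = {}
    for snapshot, packs in snapshot_packs.items():
        missing = sorted(set(packs) - on_disk)
        if missing:
            report[snapshot] = missing
    return report


snapshot_packs = {
    "snapshot/s1": ["data/d1", "data/d2"],
    "snapshot/s2": ["data/d2"],
    "snapshot/s3": ["data/d3"],
}
report = non_restorable_snapshots(snapshot_packs, ["data/d1", "data/d3"])
print(report)
```

Because "data/d2" is shared by two snapshots, losing that single pack makes both of them non-restorable, while "snapshot/s3" is unaffected.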

The index of encrypted data chunks in the Storage Vaults is carefully curated. It is still possible for the index of data chunks to become mismatched with the actual chunks stored. This can happen:

  • if a file was manually deleted from the Storage Vault, or
  • if your storage platform experienced data loss for any reason (e.g. RAID failure), or
  • if the Unlock action was used in an unsafe way, or
  • if a retention pass went back in time (e.g. via VM snapshot resumption)

Please contact Magnus Box support for assistance.
