A data retention policy is a set of rules that defines how long different kinds of data are stored, and how and when the data is cleaned up once the retention period is over.
Retention policies help organisations achieve several goals, including:
- meeting legal and regulatory requirements for data storage (HIPAA, FINRA, SOX);
- getting rid of obsolete or stale data that occupies significant storage space and makes company data harder to navigate.
To support these requirements, Afi Backup offers item-level and backup version retention policies that clean up data once it reaches the age defined by the policy.
Supported retention policies
Item-level retention
With item-level retention, backup items (e.g. emails or files) are deleted once their age (counted from the received date for emails and from the last modification date for files) exceeds the retention period specified by the policy. See "Item-level retention rules for Mail and Chats data" and "Item-level retention rules for Files data" for more details.
Item-level retention is supported for the following kinds of data:
- Emails (includes Google Mail and Google Chats data for Google Workspace; Exchange Mail, Online Archive, Group Mail data for Microsoft 365);
- Chats (Teams Chats and Channels for Microsoft 365);
- Files (includes Google Drive and Shared Drives data for Google Workspace; OneDrive, OneNote and SharePoint sites data for Microsoft 365).
Backup version retention
A backup version retention policy defines how long the system keeps historical backup snapshots. See "How does a backup version retention policy work" for more details.
How do retention policies work
A retention policy is configured as a backup SLA property and is applied to the resources protected by the corresponding SLA. Data that is older than the retention period is removed by the cleanup procedure, which is launched during the periodic backup. If protection (SLA) is removed from a resource (user, drive, site, etc.), or an SLA without configured retention rules is assigned to it, the system stops applying retention rules to that resource, and data that has reached the retention age is no longer cleaned up. For the same reason, retention rules are not applied to backups belonging to deleted or suspended users.
Please note that, for performance reasons, the cleanup procedure is not performed on every backup. Instead, it is launched with a certain probability that depends on the retention period duration and the time of the last cleanup. As a result, backups with a custom data retention period can temporarily contain some items that are older than the retention age; these items are removed during the next cleanup (the time lag for retention cleanup doesn't exceed one month). Likewise, when the retention period is changed (for example, from 5 years to 3 years), the first cleanup after the change can also happen with a delay.
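The applicability conditions described above (an SLA with retention rules is assigned, and the user is neither deleted nor suspended) can be summarised as a small predicate. This is only an illustrative sketch: the `Resource` class and its field names are hypothetical and are not part of any Afi API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    # Hypothetical model; the fields only mirror the conditions in the text.
    name: str
    # None means: no SLA assigned, or the assigned SLA has no retention rules.
    sla_retention_months: Optional[int]
    # False for deleted or suspended users.
    user_active: bool = True


def retention_applies(resource: Resource) -> bool:
    """Return True if retention cleanup would consider this resource at all."""
    return resource.sla_retention_months is not None and resource.user_active


resources = [
    Resource("alice", sla_retention_months=12),
    Resource("bob", sla_retention_months=None),
    Resource("carol", sla_retention_months=36, user_active=False),
]
applicable = [r.name for r in resources if retention_applies(r)]
print(applicable)
```

In this sketch only the first resource qualifies: the second has no retention rules, and the third belongs to an inactive user.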
Let's discuss how different retention rules are applied.
Item-level retention rules for Mail and Chats data
With an item-level email retention policy set up, all emails whose received date is older than the retention period are deleted from all historical backup snapshots. Emails that were recently moved between labels (folders) or marked as read/unread are still cleaned up based on their original received date (i.e. such events don't reset the email's age).
Example: suppose that user A receives one report email per day for 3 months (1st February, 2nd February, ..., 1st March, ..., 30th April) and has about 90 emails in the mailbox on 1st May. On 1st May, a 1-month email retention policy is set up for this user and the retention cleanup runs. During the cleanup, emails received before 1st April are removed, and emails dated from 1st to 30th April (the emails from the last 30 days) remain.
While browsing historical snapshots of backups with retention enabled, you may encounter deleted item placeholders for items already removed by the cleanup procedure. The data and metadata of these placeholder items are deleted, but the placeholders remain in the browsing view because of implementation details and to provide better visibility into the retention process.
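The email example can be reproduced with a few lines of date arithmetic. The sketch below is an illustration of the rule, not Afi's implementation: the helper names are hypothetical, the year 2023 is an assumption (a non-leap year, so the mailbox holds 89 emails), and treating an email received exactly on the cutoff date as kept is also an assumption.

```python
import calendar
from datetime import date, timedelta


def months_before(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`."""
    total = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    # Clamp the day for shorter months (e.g. 31st March minus 1 month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def split_by_retention(received_dates, today, retention_months):
    """Split emails into (kept, removed) by received date."""
    cutoff = months_before(today, retention_months)
    kept = [d for d in received_dates if d >= cutoff]
    removed = [d for d in received_dates if d < cutoff]
    return kept, removed


# One email per day from 1st February to 30th April (assumed year: 2023).
start = date(2023, 2, 1)
days = (date(2023, 4, 30) - start).days + 1
emails = [start + timedelta(days=i) for i in range(days)]

kept, removed = split_by_retention(emails, today=date(2023, 5, 1), retention_months=1)
print(len(emails), len(kept), len(removed))
```

Moving an email between folders or changing its read state would not alter `received_dates`, which is why such events don't affect the outcome.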
Item-level retention rules for Files data
With an item-level file retention policy set up, all files whose modification date is older than the retention period are deleted from all historical backup snapshots. Files whose creation date is older than the retention period but whose modification date is newer are preserved. Files that were deleted in Google Workspace / Microsoft 365 but are still present in older backup snapshots are also cleaned up based on their last modified date. If, for any reason, files older than the retention age are still present in the corresponding Google Workspace / Microsoft 365 account (although the general best practice is to set up the same retention policies across all services used by the company), they will either not be backed up at all or be removed during the next cleanup procedure.
Note that the file cleanup job doesn't delete folders, regardless of their age. It is therefore expected that the cleanup job removes files older than the retention age but leaves all folders in place.
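The file rules above (modification date decides, creation date is ignored, folders are never deleted) can be expressed as a short check. This is a minimal sketch under stated assumptions: `DriveItem` and `is_removed_by_cleanup` are hypothetical names, and the cutoff date is passed in directly rather than derived from a policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DriveItem:
    # Hypothetical model of a backed-up drive item.
    path: str
    created: date
    modified: date
    is_folder: bool = False


def is_removed_by_cleanup(item: DriveItem, cutoff: date) -> bool:
    """Apply the file retention rule to a single item."""
    if item.is_folder:
        return False  # folders are never deleted, regardless of age
    # Only the last modification date matters; the creation date is ignored.
    return item.modified < cutoff


cutoff = date(2023, 11, 1)  # assumed start of the retention period
items = [
    DriveItem("Reports", date(2020, 1, 5), date(2020, 1, 5), is_folder=True),
    DriveItem("Reports/old.xlsx", date(2021, 1, 10), date(2021, 2, 1)),
    DriveItem("Reports/plan.docx", date(2021, 1, 10), date(2024, 3, 15)),
]
removed_paths = [i.path for i in items if is_removed_by_cleanup(i, cutoff)]
print(removed_paths)
```

Here the old folder and the recently edited document survive, while the file that was last modified before the cutoff is removed.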
Backup version retention rules
With backup version retention, backup versions (snapshots) that are older than the retention period are deleted, and backup items (e.g. emails or files) are kept as long as there is at least one remaining snapshot in which the item is visible. A snapshot represents the state of a mailbox, drive, site or team at a specific point in time; a backup item (file or email) is visible in a snapshot if it was present in the mailbox (drive, site or team) at the time of the backup. Please note that a snapshot refers to all the items present in the corresponding mailbox (drive, site or team), not only the ones that were modified or created at the time of that snapshot. As a result, backup version retention cleans up only items that were deleted, and old item versions that were overwritten, before the start of the current retention period (these items are not accessible in any of the remaining snapshots).
Example: suppose that user A has document X, created on 1st February and then edited on 1st March, 15th April and 1st May; document Y, created on 1st March and deleted on 2nd March; and document Z, created on 1st February and not modified since. User A's drive is backed up daily. On 1st May, a 1-month backup version retention policy is set up for this user and the retention cleanup runs. During the cleanup, the system deletes the version of document X from 1st February (overwritten on 1st March, before the retention period started) and all the versions of document Y. The versions of document X from 1st March (still the current version in the snapshots taken from 1st to 14th April), 15th April and 1st May, as well as document Z, are preserved because they are visible in snapshots within the retention period.
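The visibility rule for backup version retention can also be sketched in code, using a separate, hypothetical drive. The model below is an assumption for illustration only, not Afi's implementation: each item version has a `written` date and an optional `replaced` date (overwrite or deletion), snapshots are taken daily and capture the end-of-day state, and a version is kept if at least one retained snapshot still shows it.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class ItemVersion:
    # Hypothetical model of one version of a backed-up item.
    name: str
    written: date
    replaced: Optional[date] = None  # overwrite or deletion date; None if current


def visible_in(version: ItemVersion, snapshot_day: date) -> bool:
    """A version is visible from the day it is written until it is replaced."""
    if snapshot_day < version.written:
        return False
    return version.replaced is None or snapshot_day < version.replaced


def versions_kept(versions, snapshot_days, cutoff):
    """Keep versions that are visible in at least one retained snapshot."""
    retained = [s for s in snapshot_days if s >= cutoff]
    return [v.name for v in versions if any(visible_in(v, s) for s in retained)]


# Daily snapshots from 1st April to 30th June 2023 (assumed dates).
first = date(2023, 4, 1)
snapshots = [first + timedelta(days=i) for i in range(91)]
versions = [
    ItemVersion("report v1", date(2023, 4, 5), replaced=date(2023, 5, 10)),
    ItemVersion("report v2", date(2023, 5, 10), replaced=date(2023, 6, 12)),
    ItemVersion("report v3", date(2023, 6, 12)),
    ItemVersion("draft", date(2023, 5, 2), replaced=date(2023, 5, 3)),
    ItemVersion("handbook", date(2023, 4, 2)),
]
kept_versions = versions_kept(versions, snapshots, cutoff=date(2023, 6, 1))
print(kept_versions)
```

With a cutoff of 1st June, the first report version and the short-lived draft disappear because no retained snapshot shows them, while the unmodified handbook stays because every retained snapshot still contains it.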
How to configure and manage retention rules
Retention rules are configured as part of a backup SLA, which defines which data to back up, how frequently, and how long to keep the backup data. The following steps are required to set up custom retention rules:
- go to the Service -> Protection -> SLA tab, select an existing SLA policy or create a new one, then choose the retention mode (item-level or backup version) and the retention period duration;
- after the SLA policy is set up, assign it to the resources or organisational units (Google Workspace) or resource groups (Microsoft 365) that you want to protect with this policy.
Here is an example of an item-level retention setup with a 1-year retention period for mail data and a 6-month retention period for drive data:
Here is an example of a backup version retention setup with a 1-year snapshot retention period.
It is possible to apply different retention rules to different Google Workspace organisational units or to members of different Microsoft 365 groups by configuring several backup SLA policies with different retention rules and assigning these backup SLAs to the corresponding organisational units or groups.
The retention period can be configured with month granularity (for example, 6 months, 1 year, 3 years, etc.).
Example: an organisation administrator might want to protect all members of the Sales organisational unit with a Gold backup SLA with 3 years of email retention and, at the same time, protect all the Shared Drives with a Silver backup SLA with 1 year of document retention.
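The Gold/Silver arrangement can be pictured as a simple mapping from targets to SLA policies. Configuration in Afi Backup is done through the SLA tab described above; the sketch below is only a hypothetical data model for reasoning about such a setup, and none of the class, field or key names correspond to a real Afi API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackupSla:
    # Hypothetical model of an SLA with item-level retention rules.
    name: str
    mail_retention_months: Optional[int] = None   # None: keep mail indefinitely
    files_retention_months: Optional[int] = None  # None: keep files indefinitely

    def __post_init__(self):
        # Retention periods are expressed in whole months.
        for months in (self.mail_retention_months, self.files_retention_months):
            if months is not None and (not isinstance(months, int) or months < 1):
                raise ValueError("retention must be a positive whole number of months")


gold = BackupSla("Gold", mail_retention_months=36)      # 3 years of email retention
silver = BackupSla("Silver", files_retention_months=12)  # 1 year of document retention

# Hypothetical assignment of SLAs to an organisational unit and to Shared Drives.
assignments = {"OU:Sales": gold, "Shared Drives": silver}


def retention_for(target: str, kind: str) -> Optional[int]:
    """Return the retention period in months, or None if no rule applies."""
    sla = assignments.get(target)
    if sla is None:
        return None
    return sla.mail_retention_months if kind == "mail" else sla.files_retention_months


print(retention_for("OU:Sales", "mail"), retention_for("Shared Drives", "files"))
```

A target without an assigned SLA, or an SLA without a rule for the given kind of data, yields `None`, which matches the behaviour that data is not cleaned up when no retention rule is configured.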