3.9. Always Incremental Backup Scheme

Always Incremental Backups are available since Bareos 16.2.4.

3.9.1. Conventional Backup Scheme Drawbacks

To better understand the advantages of the Always Incremental Backup scheme, we first analyze the way that the conventional Incremental - Differential - Full Backup Scheme works.

The following figure shows the jobs available for restore over time. Red bars are full backups, green are differential backups and blue are incremental backups. When you look at a date on the horizontal axis, you can see which backup jobs are available for a restore at that point in time.

[Figure: jobs available for restore over time]

The next figure shows the amount of data being backed up over the network from that client over time:

[Figure: data transferred over the network from the client over time]

Depending on the retention periods, old jobs are removed to save space for newer backups:

[Figure: jobs available for restore after old jobs are removed]

The problem with this way of removing jobs is that jobs on which existing jobs depend may be removed from the system.

3.9.2. Always Incremental Concept

The Always Incremental Backup Scheme runs only incremental backups of the clients, which reduces the amount of data transferred over the network to a minimum.

The Always Incremental Backup Scheme works as follows:

Client backups are always run as incremental backups. This would usually lead to an unlimited chain of incremental backups that depend on each other.

To avoid this problem, existing incremental backups older than a configurable age are consolidated into a new backup.

These two steps are then executed every day:

  • Incremental Backup from Client
  • Consolidation of the jobs older than the maximum configured age

Deleted files will stay in the backup forever if they are not detected as deleted using an Accurate (Dir->Job) backup.

The Always Incremental Backup Scheme does not itself provide longer retention periods for selected backups.

For longterm storage of data beyond the Always Incremental Job Retention, there are two options:

  • A copy job can be configured that copies existing full backups into a longterm pool.
  • A virtual full job can be configured that consolidates all existing backups into a new virtual full backup in a longterm pool.

The copy job approach is easy to implement and automatically copies all jobs that need to be copied with a single configured resource. Its disadvantage is that, at the point in time the copy is made, the copied data is already "Always Incremental Job Retention" old, so the data in the longterm storage is not the most current data available from the client.

The solution using virtual full jobs to create longterm storage has the disadvantage that a new longterm job needs to be created for every backup job.

The big advantage is that the current data will be transferred into the longterm storage.

However, the mechanism that Bareos uses to determine the base for the next incremental job would choose the longterm storage job as basis for the next incremental backup, which is not what is intended. Therefore, the job type of the longterm job is updated to "archive", so that it is not taken as base for the next incrementals and the always incremental job chain stands alone.

3.9.3. How to configure in Bareos

3.9.3.1. Always Incremental Backup Job

To configure a job to use the Always Incremental Backup Scheme, the following configuration is required:

bareos-dir job example
Job {
    ...
    Accurate = yes
    Always Incremental = yes
    Always Incremental Job Retention = <timespec>
    Always Incremental Keep Number = <number>
    ...
}
Accurate (Dir->Job) = yes
is required to detect deleted files and to prevent them from being kept in the consolidated backup jobs.
Always Incremental (Dir->Job) = yes
enables the Always Incremental feature.
Always Incremental Job Retention (Dir->Job)
sets the age up to which incrementals of this job are kept; older jobs will be consolidated.
Always Incremental Keep Number (Dir->Job)
sets the number of incrementals that are kept regardless of their age. This makes sure that a certain history of a job is kept even if the job is not executed for some time.
Always Incremental Max Full Age (Dir->Job)
is described later, see Always Incremental Max Full Age.

3.9.3.2. Consolidate Job

bareos-dir job Consolidate
Job {
    Name = "Consolidate"
    Type = "Consolidate"
    Accurate = "yes"
    JobDefs = "DefaultJob"
}
Type (Dir->Job) = Consolidate
configures a job to be a consolidate job. This job type has been introduced with the Always Incremental feature. When run, it automatically triggers the consolidation of all incremental jobs that need to be consolidated.
Accurate (Dir->Job) = yes
lets the generated virtual backup job keep the accurate information.
Max Full Consolidations (Dir->Job)
is described later, see Max Full Consolidations.

The Consolidate (Dir->Job) job evaluates all jobs configured with Always Incremental (Dir->Job) = yes. When a job is selected for consolidation, all of its job runs are taken into account, independent of the pool and storage where they are located.

The always incremental jobs need to be executed during the backup window (usually at night), while the consolidation jobs should be scheduled during the daytime, when no backups are running.
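The following is a minimal sketch of how such schedules could look. The schedule names and run times are assumptions for illustration, not defaults:

bareos-dir schedule example (sketch)
Schedule {
  Name = "Nightly"                # assumed name: referenced by the always incremental jobs
  Run = Incremental mon-sun at 21:00
}
Schedule {
  Name = "DaytimeConsolidation"   # assumed name: referenced by the Consolidate job
  Run = mon-sun at 10:00
}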

Warning

All Bareos job resources have some required directives, e.g. Client (Dir->Job).
Even though a job with Type (Dir->Job) = Consolidate evaluates none other than the directives mentioned above, the required directives still have to be defined. Normally all required directives are already set in the Job Defs (Dir->Job) resource DefaultJob. If not, you have to add them. You can use arbitrary, but valid, values.
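For illustration, a JobDefs resource along the following lines would provide the required directives. This is a sketch modeled on the default Bareos configuration; the client, fileset and schedule values are placeholders:

bareos-dir jobdefs DefaultJob (sketch)
JobDefs {
  Name = "DefaultJob"
  Type = Backup
  Level = Incremental
  Client = bareos-fd        # placeholder: any valid client
  FileSet = "SelfTest"      # placeholder: any valid fileset
  Schedule = "Nightly"      # placeholder: any valid schedule
  Storage = File1
  Messages = Standard
  Pool = AI-Incremental
  Priority = 10
}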

3.9.3.3. Storages and Pools

For the Always Incremental Backup Scheme at least two storages are needed. See Using Multiple Storage Devices for how to set up multiple storages.
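The storage resources File1 and File2 referenced by the pools below could look like this sketch. The address, password and device names are assumptions and have to match your storage daemon configuration:

bareos-dir storage example (sketch)
Storage {
  Name = File1
  Address = bareos-sd.example.com   # assumption: address of your storage daemon
  Password = "secret"               # assumption: must match the storage daemon
  Device = FileStorage1             # assumption: device defined on the storage daemon
  Media Type = File1
}
Storage {
  Name = File2
  Address = bareos-sd.example.com
  Password = "secret"
  Device = FileStorage2
  Media Type = File2
}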

bareos-dir pool AI-Incremental
Pool {
  Name = AI-Incremental
  Pool Type = Backup
  Recycle = yes                       # Bareos can automatically recycle Volumes
  Auto Prune = yes                    # Prune expired volumes
  Volume Retention = 360 days         # How long should jobs be kept?
  Maximum Volume Bytes = 50G          # Limit Volume size to something reasonable
  Label Format = "AI-Incremental-"
  Volume Use Duration = 23h
  Storage = File1
  Next Pool = AI-Consolidated         # consolidated jobs go to this pool
}
bareos-dir pool AI-Consolidated
Pool {
  Name = AI-Consolidated
  Pool Type = Backup
  Recycle = yes                       # Bareos can automatically recycle Volumes
  Auto Prune = yes                    # Prune expired volumes
  Volume Retention = 360 days         # How long should jobs be kept?
  Maximum Volume Bytes = 50G          # Limit Volume size to something reasonable
  Label Format = "AI-Consolidated-"
  Volume Use Duration = 23h
  Storage = File2
  Next Pool = AI-Longterm             # copy jobs write to this pool
}
bareos-dir pool AI-Longterm
Pool {
  Name = AI-Longterm
  Pool Type = Backup
  Recycle = yes                       # Bareos can automatically recycle Volumes
  Auto Prune = yes                    # Prune expired volumes
  Volume Retention = 10 years         # How long should jobs be kept?
  Maximum Volume Bytes = 50G          # Limit Volume size to something reasonable
  Label Format = "AI-Longterm-"
  Volume Use Duration = 23h
  Storage = File1
}

The pool AI-Longterm (Dir->Pool) is optional and will be explained in Long Term Storage of Always Incremental Jobs.

3.9.4. How it works

The following configuration extract shows how a client backup is configured for always incremental backup. The backup itself is scheduled to run every night as an incremental backup, while the consolidation is scheduled to run every day.

bareos-dir job BackupClient1
Job {
    Name = "BackupClient1"
    JobDefs = "DefaultJob"

    # Always incremental settings
    AlwaysIncremental = yes
    AlwaysIncrementalJobRetention = 7 days

    Accurate = yes

    Pool = AI-Incremental
    Full Backup Pool = AI-Consolidated
}
bareos-dir job Consolidate
Job {
    Name = "Consolidate"
    Type = "Consolidate"
    Accurate = "yes"
    JobDefs = "DefaultJob"
}

The following image shows the available backups for each day:

[Figure: available backups for each day]

  • The backup cycle starts with a full backup of the client.
  • Every day an incremental backup is done and becomes additionally available.
  • When the age of the oldest incremental exceeds Always Incremental Job Retention (Dir->Job), the consolidation job consolidates the oldest incremental with the preceding full backup into a new full backup.

This can go on more or less forever, and there will always be an incremental history spanning Always Incremental Job Retention (Dir->Job).

The following plot shows what happens if a job is not run for a certain amount of time.

[Figure: available backups when the job is not run for some time]

As can be seen, the daily consolidation jobs still go on consolidating until the last incremental is too old and only one full backup is left. This is usually not what is intended.

For this reason, the directive Always Incremental Keep Number (Dir->Job) is available, which sets the minimum number of incrementals that are kept even if they are older than Always Incremental Job Retention (Dir->Job).

Setting Always Incremental Keep Number (Dir->Job) to 7 in our case leads to the following result:

[Figure: available backups with Always Incremental Keep Number = 7]

Always Incremental Keep Number (Dir->Job) incrementals are always kept, and when the backups run again, the consolidation of old incrementals resumes.

3.9.5. Enhancements for the Always Incremental Backup Scheme

Besides the available backups at each point in time which we have considered until now, the amount of data being moved during the backups is another very important aspect.

We will have a look at this aspect in the following pictures:

3.9.5.1. The basic always incremental scheme

The basic always incremental scheme does a daily incremental backup from the client, which is relatively small and as such very good.

During the consolidation, the full backup is consolidated each day with the oldest incremental backup, which means that more or less the full amount of data stored on the client is moved. Although this consolidation is only performed locally on the storage daemon without client interaction, it is still an enormous amount of data being touched and can take a considerable amount of time.

If all clients use the “always incremental” backup scheme, this means that the complete data being stored in the backup system needs to be moved every day!

This is usually only feasible in relatively small environments.

The following figure shows the Data Volume being moved during the normal always incremental scheme.

  • The red bar shows the amount of the first full backup being copied from the client.
  • The blue bars show the amount of the daily incremental backups. They are so small that they can barely be seen.
  • The green bars show the amount of data being moved every day during the consolidation jobs.

[Figure: data volume moved with the basic always incremental scheme]

3.9.5.2. Always Incremental Max Full Age

To cope with this problem, the directive Always Incremental Max Full Age (Dir->Job) was added. When it is configured, the full backup is left untouched during daily operation while the incrementals are consolidated as usual. Only when the full backup is older than Always Incremental Max Full Age (Dir->Job) is it also included in the consolidation.

Depending on the setting of Always Incremental Max Full Age (Dir->Job), the amount of data being moved daily can be reduced without losing the advantages of the always incremental backup scheme.

Always Incremental Max Full Age (Dir->Job) must be larger than Always Incremental Job Retention (Dir->Job).

When running daily backups and daily consolidations, the resulting interval between full consolidations is Always Incremental Max Full Age (Dir->Job) minus Always Incremental Job Retention (Dir->Job).
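For example, with the following settings (the values are illustrative, not recommendations), the full backup would take part in a consolidation roughly every 21 - 7 = 14 days:

bareos-dir job example (sketch)
Job {
    ...
    Always Incremental = yes
    Always Incremental Job Retention = 7 days
    Always Incremental Max Full Age = 21 days   # must be larger than the job retention
    ...
}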

Data Volume being moved with "Always Incremental Max Full Age"

Data Volume being moved with “Always Incremental Max Full Age”

Jobs Available with "Always Incremental Max Full Age"

Jobs Available with “Always Incremental Max Full Age”

3.9.5.3. Max Full Consolidations

When the Always Incremental Max Full Age (Dir->Job) of many clients is set to the same value, it is probable that the full backups of all these clients will reach that age at the same time, so consolidation jobs including the full backup will be started for all clients at once. This would again mean that all the data stored from all clients is moved in one day.

The following figure shows the amount of data being copied by the virtual jobs that do the consolidation, with three identically configured backup jobs:

[Figure: data volume moved with three identically configured backup jobs]

As can be seen, virtual jobs including the full are triggered for all three clients at the same time.

This is of course not desirable, so the directive Max Full Consolidations (Dir->Job) was introduced.

Max Full Consolidations (Dir->Job) needs to be configured in the Type (Dir->Job) = Consolidate job:

bareos-dir job Consolidate
Job {
    Name = "Consolidate"
    Type = "Consolidate"
    Accurate = "yes"
    JobDefs = "DefaultJob"

    Max Full Consolidations = 1
}

If Max Full Consolidations (Dir->Job) is configured, the consolidate job will not start more consolidations that include the full backup than the specified number.

This leads to a better load balancing of full backup consolidations over different days. The value should be configured so that the consolidation jobs are completed before the next normal backup run starts.

The number of always incremental jobs, the interval at which the jobs are triggered, and the setting of Always Incremental Max Full Age (Dir->Job) influence the value that makes sense for Max Full Consolidations (Dir->Job). For example, if daily consolidation runs find 14 full backups due within the same two weeks, Max Full Consolidations = 1 spreads these heavy consolidations over 14 days.

[Figure: data volume being moved with Max Full Consolidations = 1]

[Figure: jobs available with Max Full Consolidations = 1]

3.9.6. Long Term Storage of Always Incremental Jobs

What is missing in the always incremental backup scheme in comparison to the traditional “Incremental Differential Full” scheme is the option to store a certain job for a longer time.

When using always incremental, the usual maximum age of data stored during the backup cycle is Always Incremental Job Retention (Dir->Job).

Usually, it is desirable to store certain backups for a longer time, e.g. a monthly backup that is kept for half a year.

There are two options to achieve this goal.

3.9.6.1. Copy Jobs

The configuration of archiving via copy jobs is simple: just configure a copy job that copies over the latest full backup at that point in time.

As all full backups go into the AI-Consolidated (Dir->Pool) pool, we just copy all uncopied jobs in AI-Consolidated to a longterm pool:

bareos-dir job CopyLongtermFull
Job {
  Name = "CopyLongtermFull"
  Schedule = LongtermFull
  Type = Copy
  Level = Full
  Pool = AI-Consolidated
  Selection Type = PoolUncopiedJobs
  Messages = Standard
}
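The copy job above references a schedule named LongtermFull, as does the virtual full job in the next section. A monthly schedule along the following lines is one plausible sketch; the interval and run time are assumptions:

bareos-dir schedule LongtermFull (sketch)
Schedule {
  Name = "LongtermFull"
  Run = 1st sat at 21:00   # assumption: once a month; no level override, each job keeps its own level
}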

As can be seen in the plot, the copy job creates a copy of the currently available full backup, which is already 7 days old.

[Figure: jobs available with longterm storage via copy jobs]

The other disadvantage is that it copies all jobs in the pool, not only the virtual full jobs: the virtual incremental jobs from this pool are copied as well.

3.9.6.2. Virtual Full Jobs

The alternative to copy jobs is to create a virtual full backup job whenever the data should be stored in the longterm pool.

bareos-dir job VirtualLongtermFull
Job {
  Name = "VirtualLongtermFull"
  Client = bareos-fd
  FileSet = SelfTest
  Schedule = LongtermFull
  Type = Backup
  Level = VirtualFull
  Pool = AI-Consolidated
  Messages = Standard

  Priority = 13                 # run after Consolidate
  Run Script {
        console = "update jobid=%i jobtype=A"
        Runs When = After
        Runs On Client = No
        Runs On Failure = No
  }
}

To make sure the longterm Level (Dir->Job) = VirtualFull job is not taken as base for the next incrementals, its job type is set to Type (Dir->Job) = Archive with the Run Script (Dir->Job).

As can be seen in the plot, the Level (Dir->Job) = VirtualFull job archives the current data, i.e. it consolidates the full backup and all incrementals that are currently available.

[Figure: jobs available with longterm storage via virtual full jobs]

3.10. How to manually transfer data/volumes

The always incremental backup scheme minimizes the amount of data that needs to be transferred over the wire.

This makes it possible to backup big filesystems over small bandwidths.

The only challenge is to do the first full backup.

The easiest way to transfer the data is to copy it to a portable medium (or even directly store it there) and import the data into the local Bareos catalog as if it had been backed up from the original client.

This can be done in two ways:

  1. Install a storage daemon in the remote location that needs to be backed up and connect it to the main director. This makes it easy to make a local backup in the remote location and then transfer the volumes to the local storage. For this option the communication between the local director and the remote storage daemon needs to be possible.

    [Figure: remote storage daemon connected to the local director]

  2. Install a director and a storage daemon in the remote location. This option means that the backup is done completely independent from the local director and only the volume is then transferred and needs to be imported afterwards.

    [Figure: independent remote Bareos installation with volume transfer]

3.10.1. Import Data from a Remote Storage Daemon

First, set up client, fileset, job and schedule as needed for an always incremental backup of the remote client.

Run the first backup, making sure that the remote storage is used:

run
*run job=BackupClient-remote level=Full storage=File-remote

Transport the volumes that were used for that backup to the local storage daemon and make them available to it. This can be done either by putting the tapes into the local changer or by storing the file volumes in the local file volume directory.

If you copy a volume to the local storage directory, make sure that the file ownership and permissions are correct.
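For example, assuming the default storage directory /var/lib/bareos/storage and a storage daemon running as user and group bareos (both may differ on your platform):

chown bareos:bareos /var/lib/bareos/storage/Full-0001
chmod 640 /var/lib/bareos/storage/Full-0001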

Now tell the director that the volume belongs to the local storage daemon.

List volumes shows that the volumes used still belong to the remote storage:

list volumes
*list volumes
.....
Pool: Full
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------------------+-------------+
| MediaId | VolumeName | VolStatus | Enabled | VolBytes | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType | LastWritten         | Storage     |
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------------------+-------------+
| 1       | Full-0001  | Append    | 1       | 38600329 | 0        | 31536000     | 1       | 0    | 0         | File      | 2016-07-28 14:00:47 | File-remote |
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------------------+-------------+

Use update volume to set the right storage and check with list volumes that it worked:

update volume
*update volume=Full-0001 storage=File
*list volumes
...
Pool: Full
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------------------+---------+
| MediaId | VolumeName | VolStatus | Enabled | VolBytes | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType | LastWritten         | Storage |
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------------------+---------+
| 1       | Full-0001  | Append    | 1       | 38600329 | 0        | 31536000     | 1       | 0    | 0         | File      | 2016-07-28 14:00:47 | File    |
+---------+------------+-----------+---------+----------+----------+--------------+---------+------+-----------+-----------+---------------------+---------+

Now the remote storage daemon can be disabled as it is not needed anymore.

The next incremental run will use the previously taken full backup as its reference.

3.10.2. Import Data from an Independent Remote Full Bareos Installation

If a network connection between the local director and the remote storage daemon is not possible, it is also an option to set up a fully functional Bareos installation remotely and then import the created volumes. Of course, the network connection between the Bareos Director Daemon and the Bareos File Daemon is needed in any case to make the incremental backups possible.

  • Configure the connection from the local Bareos Director Daemon to the remote Bareos File Daemon; give the remote client the same name it had when the data was backed up.
  • Add the fileset created on the remote machine to the local machine.
  • Configure the job that will back up the remote client with that fileset.
  • Run estimate listing on the remote backup job (see the bconsole sketch after this list).
  • Run list filesets to make sure the fileset was added to the catalog.
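In bconsole these checks could look as follows; the job name BackupClient-remote is taken from the example above and may differ in your setup:

estimate
*estimate job=BackupClient-remote listing
*list filesets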

Then we need to create a backup on the remote machine onto a portable disk which we can then import into our local installation.

On remote machine:

  • Install a full Bareos server (director, storage daemon and file daemon) on the remote server. Using the Sqlite backend is sufficient.
  • Add the client to the remote backup server.
  • Add the fileset with which the client will be backed up.
  • Add a pool named Transfer (Dir->Pool) where the data will be written to.
  • Create a job that will back up the remote client with the remote fileset into the new pool.
  • Run the backup locally using the just created pool and fileset (see the bconsole sketch after this list).
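On the remote machine, this backup could be started in bconsole like this; the job name BackupClient-transfer is an assumption for illustration:

run
*run job=BackupClient-transfer pool=Transfer level=Full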

Transport the newly created volume to the director machine (e.g. via an external hard drive) and store the file where the device stores its files (e.g. /var/lib/bareos/storage).

Shut down the director on the local director machine.

Import the data from the volume via bscan; you need to specify which database backend is used:

bscan
bscan -B sqlite3 FileStorage -V Transfer-0001 -s -S
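Putting the last steps together, the sequence could look like this. The systemd unit name bareos-dir is an assumption that may differ by platform and version (e.g. bareos-director):

systemctl stop bareos-dir
bscan -B sqlite3 FileStorage -V Transfer-0001 -s -S
systemctl start bareos-dir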

If the import completed successfully, test whether an incremental job really only backs up the minimum amount of data.