Speedy recovery without backups

Backups are often thought about in the context of a disaster: the loss of a data centre, SAN or other infrastructure. Fortunately for operators and admins everywhere, these are relatively rare occurrences. Backups also exist to give comfort in the form of regulatory and audit compliance.

However, the most common active use of backups is not for any of these purposes, but for data issues that do not compromise a whole system. These can come from poor code introducing data corruption, malicious action, or simple fat-finger user error.

In these cases the effort to recover the data is often significant. Additionally, the corruption can be propagated to other systems via replication or log-shipping, meaning those are not always viable sources from which to retrieve the data. The effort required to retrieve data from backups can sometimes mean that recovery is bypassed for simple expediency, as it's quicker and easier to try to manually fix or re-create the data.

Even when the data is important enough to recover, this can be a long and unwieldy process. If the existing database is to be kept running and serving data, then another location for the restore is required. This means disk and CPU resources have to be requisitioned, often from other systems such as operational support or development, inhibiting existing work and impacting project timelines. There is then the time required to recover the data from the backup and insert it into the live database.

In these cases using a time-travelling (or flashback) query can save a significant amount of time for the DBA and allow for data recovery to take place without impacting the rest of the application.

In short, the DBA queries the data as it was just before the corruption or data-loss event and uses it to repair the live tables, without having to rely on restoring backups and provisioning recovery areas.

Within CockroachDB this is achieved using As Of System Time (AOST) queries.

How does CockroachDB do this?

Although CockroachDB presents itself as a single Postgres wire-compatible distributed SQL database, its shared-nothing architecture means that each individual node is a key-value store where the data is broken into blocks of contiguous keys called ranges. Each range is replicated (tuneable, but defaults to 3 replicas) around the cluster to provide resilience using the Raft protocol.

As data is created, updated or deleted, rather than overwrite the affected values in place, CockroachDB uses a log-structured merge-tree approach, tombstoning the reference to the old data and writing a new version of the data alongside it. Then, in the background, the garbage collection process consolidates and removes the old versions.

What this means is that until garbage collection has run (the default window in CockroachDB is 25 hours, but again this is tuneable), the old data is not lost from the system. Even data from tables that have been deleted or dropped can be accessed and used to recover data.
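
The garbage-collection window is controlled through zone configurations, so it can be inspected and, if needed, extended while a recovery is in progress. A minimal sketch, assuming a table named a:

-- inspect the current settings; gc.ttlseconds controls the window (default 90000 seconds = 25 hours)
show zone configurations;

-- extend the window for table a to 48 hours while the recovery takes place
alter table a configure zone using gc.ttlseconds = 172800;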

How to do it:

The process in CockroachDB is as follows:

 1. Identify the point at which the data was corrupted

 2. Export the relevant data using an AOST query

 3. Import the data into a holding area

 4. Upsert the data from the holding area

Identify the point in time

If you are aware of when the data incident happened (from logs etc.) then this is simple; otherwise it may take a bit of trial and error to pick the point just before the data was changed or deleted. The format in CockroachDB for time-travelling queries is:

select * from a as of system time '2022-02-28 12:30:15';

Note that the select statement supports the usual array of where clauses:

select * from global_child as of system time '2022-02-28 12:30:15' where value = 'Quins';
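
As well as an absolute timestamp, AOST accepts a negative interval relative to the current time, which is handy when a little trial and error is needed. For example, to read the table as it was ten minutes ago:

select * from a as of system time '-10m';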

More details on specifying other time criteria can be found in the CockroachDB documentation.

Export the data

Once the point at which you wish to extract the data has been established, the data can be exported to a convenient location such as S3 or a local filesystem using the export feature.

Export to S3:

export into CSV 's3://[BUCKET]/AOST?AWS_ACCESS_KEY_ID=[KEY]&AWS_SECRET_ACCESS_KEY=[SECRET]' FROM SELECT * from a as of system time '2022-02-28 12:30:15';

This CSV file can now be found and accessed in the S3 bucket.
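
If object storage isn't available, the same export can be written to a node's local store directory instead. A sketch, assuming node 1's nodelocal storage:

export into CSV 'nodelocal://1/AOST' FROM SELECT * from a as of system time '2022-02-28 12:30:15';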

Import the data

Now we need a structure to import the data into. By using a holding area rather than importing straight into the original table, we have fine-grained control over the data. So we create an empty table with the same structure as the source table, and then import the data. One way of getting the appropriate DDL is to use the following command:

show create table [tablename];
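
The output can then be adapted into the holding table's definition. A hypothetical example, assuming the source table a has the four columns used in the import below (take the real column types from the show create table output):

-- hypothetical column types; copy the real definition from show create table
create table ajs.a_restore (id uuid primary key, city string, parent_id uuid, value string);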

Once the table is created we can import the data into the holding table:

IMPORT INTO ajs.a_restore (id, city, parent_id, value) CSV DATA ('s3://ajs-demo/AOST/[CSV Filename]?AWS_ACCESS_KEY_ID=[KEY]&AWS_SECRET_ACCESS_KEY=[SECRET]');

We now have a copy of the data in the restore table that we can use to insert into or update rows in the live table.
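
For example, assuming the live table is ajs.a with the same columns, an upsert writes the recovered rows back, inserting missing rows and overwriting any that conflict on the primary key:

upsert into ajs.a (id, city, parent_id, value) select id, city, parent_id, value from ajs.a_restore;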

Additionally, because of the way CockroachDB works with its KV store and ranges, this also applies to dropped tables or DDL changes such as drop column statements.
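
As a sketch, assuming table a has been dropped and the chosen timestamp falls before the drop (and the garbage-collection window has not yet elapsed), the historical data remains readable and can be exported or upserted in exactly the same way:

-- still works after drop table a, until garbage collection runs
select * from a as of system time '2022-02-28 12:30:15';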