Tiger Cloud: Performance, Scale, Enterprise
In Tiger Cloud, replicas are copies of the primary data instance in a Tiger Cloud service. If your primary becomes unavailable, Tiger Cloud automatically fails over to your HA replica.
The replication strategies offered by Tiger Cloud are:
High availability (HA) replicas: significantly reduce the risk of downtime and data loss due to system failure, and let services avoid downtime during routine maintenance.
Read replicas: safely scale a service to power read-intensive apps and business intelligence tooling, taking load off the primary data instance.
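For example, once connected to a service with `psql`, you can check whether the current node is the primary or a replica, and inspect replication lag, using stock PostgreSQL functions and views (these are standard PostgreSQL, not Tiger Cloud-specific):

```sql
-- Returns false on the primary, true on an HA or read replica.
SELECT pg_is_in_recovery();

-- On the primary: list connected replicas and their replication lag.
SELECT client_addr, state, replay_lag
FROM pg_stat_replication;
```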
For MST, see Failover in Managed Service for TimescaleDB. For self-hosted TimescaleDB, see Replication and high availability.
By default, all services have rapid recovery enabled.
Because compute and storage are handled separately in Tiger Cloud, services recover quickly from compute failures, but usually need a full recovery from backup for storage failures.
Compute failure: the most common cause of database failure. Compute failures can be caused by hardware faults, or by events such as unoptimized queries driving load that maxes out CPU usage. In these cases, the data on disk is unaffected and only the compute and memory need replacing. Tiger Cloud recovery immediately provisions new compute infrastructure for the service and mounts the existing storage to the new node. Any WAL that was in memory then replays. This process typically takes only thirty seconds, although, depending on the amount of WAL that needs replaying, it may take up to twenty minutes. Even in the worst case, Tiger Cloud recovery is an order of magnitude faster than a standard recovery from backup.
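As a rough way to gauge how much WAL a node still has to replay, you can compare the last received and last replayed WAL positions while the node is in recovery. These are standard PostgreSQL functions and only return meaningful values on a node that is currently replaying WAL:

```sql
-- On a node in recovery: bytes of received WAL not yet replayed.
SELECT pg_wal_lsn_diff(pg_last_wal_receive_lsn(),
                       pg_last_wal_replay_lsn()) AS replay_lag_bytes;

-- Time since the last transaction was replayed.
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;
```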
Storage failure: in the rare event of a disk failure, Tiger Cloud automatically performs a full recovery from backup.
If CPU usage for a service runs high for long periods, issues such as WAL archiving queuing behind other processes can occur. This can cause a failure and result in greater data loss. To avoid data loss, Tiger Cloud monitors services for this kind of scenario.
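If you want to keep an eye on archiving health yourself, PostgreSQL's built-in `pg_stat_archiver` view exposes the relevant counters (standard PostgreSQL, not Tiger Cloud-specific):

```sql
-- A growing failed_count, or a last_archived_time that stops advancing,
-- can indicate that WAL archiving is falling behind.
SELECT archived_count, last_archived_time,
       failed_count, last_failed_time
FROM pg_stat_archiver;
```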