Tiger Cloud: Performance, Scale, Enterprise

Self-hosted products

MST

Hypertables are Postgres tables in TimescaleDB that automatically partition your time-series data by time. Time-series data represents the way a system, process, or behavior changes over time. Hypertables enable TimescaleDB to work efficiently with time-series data. Each hypertable is made up of child tables called chunks. Each chunk is assigned a range of time, and only contains data from that range. When you run a query, TimescaleDB identifies the correct chunk and runs the query on it, instead of going through the entire table.

Hypercore is the hybrid row-columnar storage engine in TimescaleDB used by hypertables. Traditional databases force a trade-off between fast inserts (row-based storage) and efficient analytics (columnar storage). Hypercore eliminates this trade-off, allowing real-time analytics without sacrificing transactional capabilities.

Hypercore dynamically stores data in the most efficient format for its lifecycle:

  • Row-based storage for recent data: the most recent chunk (and possibly more) is always stored in the rowstore, ensuring fast inserts, updates, and low-latency single record queries. Additionally, row-based storage is used as a writethrough for inserts and updates to columnar storage.
  • Columnar storage for analytical performance: chunks are automatically compressed into the columnstore, optimizing storage efficiency and accelerating analytical queries.

Unlike traditional columnar databases, hypercore allows data to be inserted or modified at any stage, making it a flexible solution for both high-ingest transactional workloads and real-time analytics—within a single database.

Because TimescaleDB is 100% Postgres, you can use all the standard Postgres tables, indexes, stored procedures, and other objects alongside your hypertables. This makes creating and working with hypertables similar to standard Postgres.

Hypertable structure

Each hypertable is partitioned into child hypertables called chunks. Each chunk is assigned a range of time, and only contains data from that range.

By default, each hypertable chunk holds data for 7 days. You can change this to better suit your needs. For example, if you set chunk_interval to 1 day, each chunk stores data for a single day.

The following figure shows the difference in structure between a relational table and a hypertable:

Compare a relational table to a hypertable

TimescaleDB divides time into potential chunk ranges, based on the chunk_interval. Each hypertable chunk holds data for a specific time range only. When you insert data from a time range that doesn't yet have a chunk, TimescaleDB automatically creates a chunk to store it.

In practice, this means that the start time of your earliest chunk does not necessarily equal the earliest timestamp in your hypertable. Instead, there might be a time gap between the start time and the earliest timestamp. This doesn't affect your usual interactions with your hypertable, but might affect the number of chunks you see when inspecting it.

Partitioning on time is the most common use case for hypertable, but it may not be enough for your needs. For example, you may need to scan for the latest readings that match a certain condition without locking a critical hypertable. Best practice to optimize ingest and query performance is to add a partitioning dimension on a non-time column such as location or device UUID, and specify a number of partitions.

You add a partitioning dimension at the same time as you create the hypertable, when the table is empty. The good news is that although you select the number of partitions at creation time, as your data grows you can change the number of partitions later and improve query performance. Changing the number of partitions only affects chunks created after the change, not existing chunks. To set the number of partitions for a partitioning dimension, call set_number_partitions. For example:

  1. Create the hypertable with 1 day interval chunk interval

    CREATE TABLE conditions(
    "time" timestamptz not null,
    device_id integer,
    temperature float
    )
    WITH(
    timescaledb.hypertable,
    timescaledb.partition_column='time',
    timescaledb.chunk_interval='1 day'
    );
  2. Add a hash partition on a non-time column

    select * from add_dimension('conditions', by_hash('device_id', 3));

    Now use your hypertable as usual, but you can also ingest and query efficiently by the device_id column.

  3. Change the number of partitions as you data grows

    select set_number_partitions('conditions', 5, 'device_id');

Best practices for maintaining a high performance when scaling include:

  • Limit the number of hypertables in your service; having tens of thousands of hypertables is not recommended.
  • Choose a strategic chunk size.

Chunk size affects insert and query performance. You want a chunk small enough to fit into memory so you can insert and query recent data without reading from disk. However, having too many small and sparsely filled chunks can affect query planning time and compression. The more chunks in the system, the slower that process becomes, even more so when all those chunks are part of a single hypertable.

Postgres builds the index on the fly during ingestion. That means that to build a new entry on the index, a significant portion of the index needs to be traversed during every row insertion. When the index does not fit into memory, it is constantly flushed to disk and read back. This wastes IO resources which would otherwise be used for writing the heap/WAL data to disk.

The default chunk interval is 7 days. However, best practice is to set chunk_interval so that prior to processing, the indexes for chunks currently being ingested into fit within 25% of main memory. For example, on a system with 64 GB of memory, if index growth is approximately 2 GB per day, a 1-week chunk interval is appropriate. If index growth is around 10 GB per day, use a 1-day interval.

You set chunk_interval when you create a hypertable, or by calling set_chunk_time_interval on an existing hypertable.

For a detailed analysis of how to optimize your chunk sizes, see the blog post on chunk time intervals. To learn how to view and set your chunk time intervals, see how to Optimize hypertable chunk intervals.

By default, indexes are automatically created when you create a hypertable. The default index is on time, descending. You can prevent index creation by setting the create_default_indexes option to false.

Hypertables have some restrictions on unique constraints and indexes. If you want a unique index on a hypertable, it must include all the partitioning columns for the table. To learn more, see Enforce constraints with unique indexes on hypertables.

You can prevent index creation by setting the create_default_indexes option to false.

This section shows you:

Keywords

Found an issue on this page?Report an issue or Edit this page in GitHub.