
Tiger Lake lets you build real-time applications and manage efficient data pipelines within a single system. It unifies the Tiger Cloud operational architecture with data lake architectures.

This experimental release is a native integration that synchronizes hypertables and relational tables running in Tiger Cloud services to Iceberg tables in Amazon S3 Tables in your AWS account.

Early access

To follow the steps on this page:

To connect a Tiger Cloud service to your data lake:

When you start streaming, all existing data in the table is synchronized to Iceberg. Records are imported in time order, from oldest to newest, at a write throughput of approximately 40,000 records per second. For larger tables, a full import can take some time.
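For example, backfilling a table of one billion rows at roughly 40,000 records per second takes on the order of 1,000,000,000 / 40,000 ≈ 25,000 seconds, or about seven hours.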

For Iceberg to apply update and delete statements, your hypertable or relational table must have a primary key. Composite primary keys are supported.
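As an illustration, here is a minimal sketch of a table that meets this requirement, using a hypothetical metrics table with a composite primary key on device_id and ts; the names are illustrative, and the standard create_hypertable call is assumed for converting the table into a hypertable:

-- Hypothetical table; in a hypertable, the primary key must include the time column.
CREATE TABLE metrics (
  device_id integer NOT NULL,
  ts timestamptz NOT NULL,
  value double precision,
  PRIMARY KEY (device_id, ts)
);

-- Convert the table into a hypertable partitioned on ts.
SELECT create_hypertable('metrics', 'ts');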

To stream data from a Postgres relational table or a hypertable in your Tiger Cloud service to your data lake, run the following statement:

ALTER TABLE <table_name> SET (
  tigerlake.iceberg_sync = true | false,
  tigerlake.iceberg_partitionby = '<partition_specification>'
);
  • tigerlake.iceberg_sync: boolean. Set to true to start streaming, or false to stop the stream. A stream cannot resume after being stopped.
  • tigerlake.iceberg_partitionby: optional property that defines a partition specification in Iceberg. By default, the Iceberg table is partitioned as day(<time-column of the hypertable>); this default applies only to hypertables. For more information, see partitioning.
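To check which of these options are currently set on a table, the following is a minimal sketch, assuming the tigerlake.* options are stored as ordinary Postgres storage parameters (reloptions) because they are set through ALTER TABLE ... SET; my_hypertable is a placeholder name:

-- List the storage parameters recorded for the table; if the assumption holds,
-- any tigerlake.* settings appear in reloptions.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'my_hypertable';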

By default, the Iceberg table for a hypertable is partitioned by day(time-column). Syncing a Postgres relational table does not enable any partitioning in Iceberg; set it with tigerlake.iceberg_partitionby. The following partition intervals and specifications are supported:

  • hour: extract a timestamp hour, as hours from epoch (epoch is 1970-01-01). Source types: timestamp, timestamptz.
  • day: extract a date or timestamp day, as days from epoch. Source types: date, timestamp, timestamptz.
  • month: extract a date or timestamp month, as months from epoch. Source types: date, timestamp, timestamptz.
  • year: extract a date or timestamp year, as years from epoch. Source types: date, timestamp, timestamptz.
  • truncate[W]: value truncated to width W. See the Iceberg truncate options for supported source types.

These partition intervals define the partitioning behavior using the Iceberg partition specification.

The following samples show you how to tune data sync from a hypertable or a Postgres relational table to your data lake:

  • Sync a hypertable with the default one-day partitioning interval on the ts_column column

    To start syncing data from a hypertable to your data lake using the default one-day partitioning scheme for the Iceberg table, run the following statement:

    ALTER TABLE my_hypertable SET (tigerlake.iceberg_sync = true);

    This is equivalent to day(ts_column).

  • Specify a custom partitioning scheme for a hypertable

    Use the tigerlake.iceberg_partitionby property to specify a different partitioning scheme for the Iceberg table when you start the sync. For example, to partition the Iceberg table hourly on the ts_column column of a hypertable, run the following statement:

    ALTER TABLE my_hypertable SET (
      tigerlake.iceberg_sync = true,
      tigerlake.iceberg_partitionby = 'hour(ts_column)'
    );
  • Set the partitioning scheme when syncing a relational table

    Postgres relational tables do not forward a partitioning scheme to Iceberg; you must specify one with tigerlake.iceberg_partitionby when you start the sync. For example, to sync a standard Postgres table to an Iceberg table with daily partitioning, run the following statement:

    ALTER TABLE my_postgres_table SET (
      tigerlake.iceberg_sync = true,
      tigerlake.iceberg_partitionby = 'day(timestamp_col)'
    );
  • Stop sync to an Iceberg table for a hypertable or a Postgres relational table

    ALTER TABLE my_hypertable SET (tigerlake.iceberg_sync = false);

Limitations

  • Only Postgres 17.4 is supported. Services running Postgres 17.5 are downgraded to 17.4. A version check is sketched after this list.
  • Only the Amazon S3 Tables Iceberg REST catalog is supported.
  • To capture deletes made to data in the columnstore, certain columnstore optimizations are disabled for hypertables.
  • The TRUNCATE statement is not supported; it does not truncate data in the corresponding Iceberg table.
  • Data in a hypertable that has been moved to the low-cost object storage tier is not synced.
  • Renaming a table in Postgres stops the sync to Iceberg and causes unexpected behavior.
  • Writing to the same S3 table bucket from multiple services is not supported; the bucket-to-service mapping is one-to-one.
  • Iceberg snapshots are pruned automatically when their number exceeds 2,500.
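To confirm which Postgres version a service is running before you enable sync, a minimal check using a standard Postgres command (no Tiger Lake specifics assumed):

-- Report the server version, for example 17.4.
SHOW server_version;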
