The HStore table within the GaussDB(DWS) hybrid data warehouse offers binlog to facilitate the capture of database events. This enables the export of incremental data to third-party components like Flink. By consuming binlog data, you can synchronize upstream and downstream data, improving data processing efficiency.
Unlike traditional MySQL binlog, which logs all database changes and focuses on data recovery and replication, the GaussDB(DWS) hybrid data warehouse binlog is optimized for real-time data synchronization. It records DML operations (INSERT, DELETE, UPDATE, and UPSERT) and excludes DDL operations.
GaussDB(DWS) Binlog has the following advantages:
With Flink's real-time processing capabilities and Binlog, you can build a hybrid data warehouse efficiently without additional components like Kafka. The architecture is streamlined, and data flows efficiently, driven by Flink SQL.
but these will reset incremental data and synchronization details.
| Field | Type | Description |
|---|---|---|
| gs_binlog_sync_point | BIGINT | Binlog system field that indicates the synchronization point. In common GTM mode, the value is unique and ordered. |
| gs_binlog_event_sequence | BIGINT | Binlog system field that indicates the sequence of operations of the same transaction type. |
| gs_binlog_event_type | CHAR | Binlog system field that indicates the operation type of the current record. Values include I (INSERT), d (DELETE), B (before UPDATE), and U (after UPDATE). |
| gs_binlog_timestamp_us | BIGINT | Binlog system field that indicates the timestamp when the current record is saved to the database. This field is available only when the binlog timestamp function is enabled. If the function is disabled, this field is left blank. Only 9.1.0.200 and later versions support this function. |
| user_column_1 | User column | User-defined data column |
| ... | ... | ... |
| user_column_n | User column | User-defined data column |
```sql
CREATE TABLE hstore_binlog_source (
    c1 INT PRIMARY KEY,
    c2 INT,
    c3 INT
) WITH (
    ORIENTATION = COLUMN,
    enable_hstore_opt=true,
    enable_binlog=on,
    binlog_ttl = 86400
);
```
Run the ALTER command to enable the binlog function for an existing HStore table.
```sql
CREATE TABLE hstore_binlog_source (
    c1 INT PRIMARY KEY,
    c2 INT,
    c3 INT
) WITH (
    ORIENTATION = COLUMN,
    enable_hstore_opt=true
);
ALTER TABLE hstore_binlog_source SET (enable_binlog=on);
```
You can use the system functions provided by GaussDB(DWS) to query the binlog information of the target table on a specified DN and check whether the binlog is consumed by downstream processes.
```sql
-- Simulate Flink to call a system function to obtain the synchronization point. The parameters indicate the table name, slot name, whether the point is a checkpoint, and target DN (0 indicates all DNs).
select * from pg_catalog.pgxc_get_binlog_sync_point('hstore_binlog_source', 'slot1', false, 0);
select * from pg_catalog.pgxc_get_binlog_sync_point('hstore_binlog_source', 'slot1', true, 0);
-- Incremental binlogs are generated after additions, deletions, and modifications.
INSERT INTO hstore_binlog_source VALUES(100, 1, 1);
delete hstore_binlog_source where c1 = 100;
INSERT INTO hstore_binlog_source VALUES(200, 1, 1);
update hstore_binlog_source set c2 =2 where c1 = 200;
-- Simulate Flink to call a system function to query the binlog of a specified CSN range. The parameters indicate the table name, target DN (0 indicates all DNs), start CSN point, and end CSN point.
select * from pgxc_get_binlog_changes('hstore_binlog_source', 0, 0 , 9999999999);
```

Two INSERT operations generate two records with gs_binlog_event_type as I. The DELETE operation generates a record whose type is d. The UPDATE operation generates a B record for BeforeUpdate and a U record for AfterUpdate, indicating the values before and after the update.
You can call the system function pgxc_consumed_binlog_records to check whether the binlogs of the target table are consumed by all slots. The parameters indicate the target table name and target DN (0 indicates all DNs).
```sql
-- Simulate Flink to call the system function to register a synchronization point. The parameters indicate the table name, slot name, registered point, whether the point is a checkpoint, and xmin corresponding to the point (provided when the synchronization point is obtained).
select pgxc_register_binlog_sync_point('hstore_binlog_source', 'slot1', 0, 9999999999, false, 100);
select pgxc_register_binlog_sync_point('hstore_binlog_source', 'slot1', 0, 9999999999, true, 100);
-- Check whether all binlogs in the table are consumed. If 1 is returned, all binlogs have been consumed by downstream slots.
select * from pgxc_consumed_binlog_records('hstore_binlog_source',0);
```

```sql
CREATE TABLE hstore_binlog_source(
    c1 INT PRIMARY KEY,
    c2 INT,
    c3 INT
) WITH (
    ORIENTATION = COLUMN,
    enable_hstore_opt=true,
    enable_binlog_timestamp =on,
    binlog_ttl = 86400
);
```
Query the binlog on the table where the binlog timestamp function is enabled.
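A minimal sketch of such a query, assuming the pgxc_get_binlog_changes call shown earlier applies unchanged to a timestamp-enabled table. The CSN range 0 to 9999999999 is only the illustrative range used in the earlier example, and the result would be expected to include the gs_binlog_timestamp_us field.

```sql
-- Assumption: same function and parameters as in the earlier example
-- (table name, target DN where 0 indicates all DNs, start CSN point, end CSN point).
select * from pgxc_get_binlog_changes('hstore_binlog_source', 0, 0, 9999999999);
```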

Convert gs_binlog_timestamp_us from the BIGINT type to a readable timestamp.

```sql
select to_timestamp(1731569598408661/1000000);
```
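The same conversion can be sketched across a binlog query result instead of a single literal. This assumes that pgxc_get_binlog_changes exposes the binlog system fields as ordinary columns that can be named in the select list. The alias saved_at is hypothetical.

```sql
-- Assumption: the system fields are selectable by name from the function result.
-- gs_binlog_timestamp_us is in microseconds, so divide by 1000000 to get seconds.
select gs_binlog_event_type,
       to_timestamp(gs_binlog_timestamp_us/1000000) as saved_at  -- hypothetical alias
from pgxc_get_binlog_changes('hstore_binlog_source', 0, 0, 9999999999);
```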

Obtain the first binlog record of the target table after a specified time point on each DN. If the result is empty, no binlog exists after that time point.
```sql
select * from pgxc_get_binlog_cursor_by_timestamp('hstore_binlog_source','2024-11-14 15:33:18.40866+08', 0);
```

Obtain the consumption progress of the table for which the binlog timestamp function is enabled.
The returned fields indicate the timestamp of the latest consumed binlog, the latest timestamp on the binlog, the CSN point of the latest consumed binlog, the latest CSN point on the binlog, and the number of unconsumed binlog records.
```sql
-- Simulate Flink to call the system function to register a synchronization point. The parameters indicate the table name, slot name, registered point, whether the point is a checkpoint, and xmin corresponding to the point (provided when the synchronization point is obtained).
select pgxc_register_binlog_sync_point('hstore_binlog_source', 'slot1', 0, 9999999999, false, 100);
select pgxc_register_binlog_sync_point('hstore_binlog_source', 'slot1', 0, 9999999999, true, 100);
-- Query the consumption progress of each slot in the target table.
select * from pgxc_get_binlog_consume_progress('hstore_binlog_source', 0);
```

You can set the session-level parameter enable_generate_binlog to off to control whether DML in the current session generates binlog records. With the parameter off, data imported into a binlog-enabled table in that session generates no binlog records.
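A minimal sketch of that session-level switch, assuming the parameter is changed with a standard SET statement and is on by default. The inserted row is purely illustrative and reuses the hstore_binlog_source table from the earlier examples.

```sql
-- Turn off binlog generation for DML in the current session only.
SET enable_generate_binlog = off;

-- Illustrative import: this row is written to the table,
-- but no binlog record should be generated for it.
INSERT INTO hstore_binlog_source VALUES(300, 1, 1);

-- Turn binlog generation back on for later DML in this session.
SET enable_generate_binlog = on;
```

Because the setting is scoped to the session, other sessions writing to the same table would continue to generate binlog records.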