3. Full Import and Streaming Incremental Synchronization from PostgreSQL to Klustron
3. Full Import and Streaming Incremental Synchronization from PostgreSQL to Klustron
01 Environment
Migrate business data from native PostgreSQL database to the Klustron cluster. Since Klustron is compatible with the PostgreSQL protocol, it can directly support PostgreSQL data import/export tools: pg_dump and pg_restore.
However, since Klustron currently only supports PostgreSQL version 11.5, for incompatible versions: newer version tools can be compatible with older versions, while older version tools do not support newer versions.
For online real-time data synchronization, Klustron supports Oracle GoldenGate, which can be used to synchronize data from PostgreSQL or other databases (Oracle/SQL Server/MySQL, etc.) to Klustron.
Klustron supports importing data from MySQL instances into the Klustron cluster, including full static import and full import with streaming incremental synchronization. For more details, please refer to Import and Streaming Incremental Synchronization from MySQL to Klustron
Klustron supports importing data from a TiDB cluster to a Klustron cluster, including full static import and full import with streaming incremental synchronization. For more information, please refer to Importing Data from TiDB 4.x and Earlier Versions to Klustron and Importing Data from TiDB 5 and Later Versions to Klustron
02 Steps for Data Export and Import
Due to the fact that Klustron requires tables to have primary keys when inserting data, to ensure the success of data migration, the process of exporting data from the native PostgreSQL database needs to be performed in two steps (first, export the data structure, and second, export the actual data).
During the recovery process, the data structure is restored first, and then the data is imported. Below are two scenarios demonstrating the specific operation instructions.
2.1 Export tables
2.1.1 Export table structure
pg_dump -h 192.168.0.142 -p 5432 -tcustomer1_3 -t customer1_4 postgres-s>customer1_34str.sql (-s parameter is to export only the table structure)
2.1.2 Export table data
pg_dump -h 192.168.0.142 -p 5432 -tcustomer1_3 -t customer1_4 postgres -a >customer1_34_data.sq
2.1.3 Restore table structure
psql --dbname=bktest3 --host=192.168.0.142 --port=5401 -U abc -f customer1_34str.sql
2.1.4 Restore table data
psql --dbname=bktest3 --host=192.168.0.142 --port=5401 -U abc -f customer1_34_data.sql
2.2 Export a full database
2.2.1 Export the structure of the full database
pg_dump -d postgres -U postgres -s>postgres_str.sql
2.2.2 Export the structure of the full database
pg_dump -d postgres -U postgres -a>postgres_data.sql
Alternatively, data can be exported in dump or tar format to reduce space and improve performance:
pg_dump -d postgres -U postgres -a -Fc >postgres_data.dump (尺寸最小)
pg_dump -d postgres -U postgres -a -Ft >postgres_data.tar
2.2.3 Restore the full database structure
psql --dbname=bktest2 --host=192.168.0.142 --port=5401 -U abc -f postgres_str.sql
2.2.4 Restore the full database data
psql --dbname=bktest2 --host=192.168.0.142 --port=5401 -U abc -f postgres_data.sql
If the data was exported in dump format, pg_restore
should be used:
pg_restore -h 192.168.0.142 -p 5401 -U abc-d bktest2 postgres_data.dump
(Note that pg_restore
can only be backwards compatible. Files exported by higher version pg_dump
cannot be restored with lower version pg_restore
.)
03 Oracle GoldenGate Real-time Synchronization from PostgreSQL to Klustron
3.1 Install and deploy Oracle GoldenGate
You need to download and install the OGG for PostgreSQL microservices version, oggs_Linux_x64_PostgreSQL_services_shiphome version. Follow the official installation manual to install it.
The installation and deployment architecture is shown in the following figure:
3.2 Configure PostgreSQL data source and Klustron target
Source database information: IP address 192.168.0.10, replication user: postgres, database name: PGSOUR
Target database information: IP address 192.168.0.12, replication user: postgres, database name: Klustron
Set the database parameters:
alter system set wal_level='logical';
alter system set max_replication_slots = 20;
Write OGG odbc.ini file:
[ODBC Data Sources]
PGDSN1=DataDirect 14 PostgreSQL Wire Protocol
PGDSN2=DataDirect 11 PostgreSQL Wire Protocol
[ODBC]
IANAAppCodePage=106
InstallDir=/ogg21/ogg_ma
[PGDSN1]
Driver=/ogg21/ogg_ma/lib/GGpsql25.so
Description=DataDirect 14 PostgreSQL Wire Protocol
Database=PGSOUR
HostName=192.168.0.10
PortNumber=5432
LogonID=postgres
Password=Klustrontest
[PGDSN2]
Driver=/ogg21/ogg_ma/lib/GGpsql25.so
Description=DataDirect 11 PostgreSQL Wire Protocol
Database=Klustron
HostName=192.168.0.12
PortNumber=5401
LogonID=postgres
Password=Klustrontest
3.3 Configure the replication environment
Create identity credentials and checkpoint table:
Login:
3.4 Configure and start the extract process
Extract process configuration file:
extract ext1
SETENV(PGCLIENTENCODING = "UTF8" )
SETENV(ODBCINI="/ogg21/ogg_ma/odbc.ini" )
SOURCEDB PGDSN1 USERIDALIAS PG1, DOMAIN OGGMA
exttrail ./dirdat/e1
IGNOREREPLICATES
TRANLOGOPTIONS FILTERTABLE ogg.ckpt
table public.*;
3.5 Configure and start the replicat process
Replicat process configuration file:
replicat rep1
SETENV(PGCLIENTENCODING = "UTF8" )
SETENV(ODBCINI="/ogg21/ogg_ma/odbc.ini" )
SETENV(NLS_LANG="AMERICAN_AMERICA.AL32UTF8")
TARGETDB PGDSN2 USERIDALIAS PG2, DOMAIN OGGMA
map public.*, target public.*;
3.6 Perform Synchronization and Check Synchronization
You can insert data into the source database and check the results in the target database:
You can also check the replication progress through the management interface:
3.7 Stress Testing
Stress testing includes two parts (due to the low configuration of the test virtual machine environment, the amount of data tested is relatively small).
On the source side, use sysbench to insert real-time data into PostgreSQL and check the update of data on the target side Klustron.
The target database runs under certain loads.
Execute data loading on the source side:
sysbench oltp_read_write --table-size=100000 --tables=10 --pgsql-host=192.168.0.10 --pgsql-port=5432 --pgsql-user=postgres --pgsql-password=Klustrontest --pgsql-db=PGSOURE --db-ps-mode=disable --threads=1 --db-driver=pgsql prepare
Target database executes stress testing scripts:
sysbench oltp_read_write --table-size=100 --tables=10 --pgsql-host=192.168.0.12 --pgsql-port=5401 --pgsql-user=abc --pgsql-password=abc --pgsql-db=postgres --db-ps-mode=disable --threads=10 --db-driver=pgsql --time=600 run
Target side checks data synchronization status:
select count(*) from sbtest10;
count
-------
100000
(1 row)
The test conclusion is that the real-time synchronization from PostgreSQL to Klustron using Oracle GoldenGate is successful and stable.